Glucuronide repressors and uses thereof

ABSTRACT

Clones containing a sequence encoding a glucuronide repressor are described. The nucleotide and amino acid sequences of a repressor (gusR) are presented. A glucuronide repressor is used to control expression of a transgene, detect glucuronides in a sample, and isolate glucuronides from a sample, among other uses.

CROSS-RELATED APPLICATIONS

[0001] The present application claims priority from U.S. Provisional Application No. 60/020,621 filed Jun. 26, 1996, which application is incorporated herein by reference in its entirety.

TECHNICAL FIELD

[0002] The present invention relates generally to a repressor molecule for a glucuronidase operon and, more specifically, to amino acid and DNA sequences of a repressor and uses for a repressor protein.

BACKGROUND OF THE INVENTION

[0003] The natural habitat of E. coli is the gut, and the β-glucuronidase activity of E. coli plays a specific and very important role in its natural history. The gut is a rich source of glucuronic acid compounds, providing a carbon source that can be efficiently exploited by E. coli. Glucuronide substrates are taken up by E. coli via a specific transporter, the glucuronide permease (U.S. Pat. Nos. 5,268,463 and 5,432,081), and cleaved by β-glucuronidase. The glucuronic acid residue thus released is used as a carbon source.

[0004] In general, the aglycon component of the glucuronide substrate is not used by E. coli and passes back across the bacterial membrane into the gut to be reabsorbed into the bloodstream. This circulation of hydrophobic compounds resulting from the opposing processes of glucuronidation in the liver and deglucuronidation in the gut is termed enterohepatic circulation. This phenomenon is of great physiological importance because it means that, due in large part to the action of microbial β-glucuronidase, many compounds including endogenous steroid hormones and exogenously administered drugs are not eliminated from the body all at once. Rather, the levels of these compounds in the bloodstream oscillate due to this circulatory process. This process is of great significance in determining pharmaceutical dosages, and indeed some drugs are specifically administered as the glucuronide conjugate, relying on the action of β-glucuronidase to release the active aglycon (Draser and Hill, 1974).

[0005] β-glucuronidase is encoded by the gusA locus of E. coli (Novel and Novel, Mol. Gen. Genet. 120:319-335, 1973). gusA (GUS) is one member of an operon consisting of three protein-encoding genes. The second gene, gusB (PER), encodes a specific permease for β-glucuronides. The third gene, gusC (MOP), encodes an outer membrane protein of approximately 50 kDa that facilitates access of glucuronides to the permease located in the inner membrane. The principal repressor for the gus operon, gusR, maps immediately upstream of the operon.

[0006] β-glucuronidase activity is not constitutively expressed in E. coli; rather, transcription of the operon is regulated by several factors. The primary mechanism of control is induction by glucuronide substrates. This regulation is due to the action of the product of the gusR (formerly uidR) gene which encodes the repressor. gusR was mapped by deletion mutation analysis to the same region of the chromosome as gusA, lying upstream of gusA. GusR repression of β-glucuronidase activity has been shown by Northern analysis to be mediated by transcriptional regulation: RNA from uninduced cultures of E. coli does not hybridize to a gusA probe, in contrast to the strong hybridization observed to RNA extracted from cultures that had been induced with methyl β-D-glucuronide (Jefferson, DNA Transformation of Caenorhabditis elegans: Development and Application of a New Gene Fusion System. Ph.D. Dissertation, University of Colorado, Boulder, Colo., 1985). Presumably, therefore, GusR represses gusA transcription by binding to gusA operator sequences, thereby preventing transcription. This repression would then be relieved when a glucuronide substrate binds to the repressor and inactivates it.

[0007] The present invention provides gene and protein sequences of glucuronide repressors and use of the repressor for controlling gene expression and detecting glucuronides, while providing other related advantages.

SUMMARY OF THE INVENTION

[0008] This invention generally provides isolated nucleic acid molecules encoding a glucuronide repressor. In particular, a nucleotide and amino acid sequence of the E. coli glucuronide repressor (gusR) are provided. In preferred embodiments, the nucleotide sequence of the repressor is presented in SEQ. ID. NO: 1 or a variant thereof. In certain embodiments, nucleic acid molecules that hybridize to gusR are provided. Nucleic acid sequences that encode a glucuronide binding site of a glucuronide repressor are presented.

[0009] In another aspect, this invention provides a glucuronide repressor protein that binds to a glucuronide operator and that binds to a glucuronide, wherein the binding to the operator is inversely dependent on glucuronide binding. In certain preferred embodiments the repressor comprises the sequence presented in SEQ. ID NO: 2 or a variant thereof. In other preferred embodiments, the repressor comprises a fusion protein of a glucuronide binding site or domain and a nucleotide-binding domain.

[0010] In yet other aspects, methods for isolating a glucuronide are provided, comprising (a) contacting a glucuronide binding domain from a glucuronide repressor with a sample containing a glucuronide, wherein the glucuronide binds to the repressor protein; and (b) eluting the glucuronide from the repressor.

[0011] Other aspects provide methods for determining the presence or detecting the presence of a glucuronide in a sample, comprising (a) binding a repressor protein to a nucleic acid molecule comprising a glucuronide operator sequence to form a complex; (b) contacting the complex with a sample containing a glucuronide, wherein the glucuronide binds to the repressor protein causing release of the protein from the nucleic acid molecule; and (c) detecting release of the protein.

[0012] In other aspects, methods are provided for controlling gene expression of a transgene, comprising (a) transfecting or transforming a cell with a nucleic acid molecule comprising a nucleotide sequence encoding the repressor protein, a glucuronide operator sequence, and a transgene, wherein the operator is operably linked to the transgene; and (b) contacting the cell with a glucuronide that binds to the repressor protein; wherein the glucuronide causes the repressor protein to release from the operator sequence, thereby allowing expression of the transgene.

[0013] In yet other aspects, methods are provided for identifying a vertebrate glucuronide transport protein, comprising doubly transfecting a host cell lacking transport activity with a reporter gene under control of a glucuronide repressor and an expression library constructed from vertebrate RNA, and screening for expression of the reporter gene in the presence of a glucuronide.

[0014] These and other aspects of the present invention will become evident upon reference to the following detailed description and attached drawings. In addition, various references are set forth herein which describe in more detail certain procedures or compositions (e.g., plasmids, etc.), and are incorporated by reference in their entirety.

BRIEF DESCRIPTION OF THE DRAWINGS

[0015] FIG. 1 is a drawing depicting the gus operon of E. coli and the activity of the gus proteins on a β-glucuronide.

[0016] FIG. 2 shows the reaction catalyzed by β-glucuronidase and examples of various substrates useful for detection of GUS activity.

[0017] FIG. 3 is a map of pKW223. This plasmid contains a 1.4 kb BstXI-NcoI fragment harboring the gusR gene.

[0018] FIG. 4 is a schematic depicting two glucuronide repressor expression systems. The upper figure shows constructs used in a glucuronide (R-glcA) dependent expression system. The lower figure shows constructs used in a glucuronide repressed expression system. O, operator sequence; pA, polyadenylation signal; gusR fusion, a fusion protein comprising a DNA binding domain, a glucuronide binding domain and a transcriptional activation domain.

[0019] FIG. 5 depicts the enterohepatic circulation of glucuronide conjugates.

[0020] FIG. 6 is a map of the region of the gus operon claimed as a BamHI fragment.

[0021] FIG. 7 is a restriction map of pKW244.

[0022] FIG. 8 depicts the strategy of an operator/repressor experiment. A high copy plasmid containing an operator site is introduced into a cell with a gus operon located on the E. coli chromosome. The operator binds available repressor, allowing transcription of the gus operon.

[0023] FIG. 9 shows an example of an operator/repressor titration experiment. A: DH5α cells transformed with pBSIISK+ and plated on LB media containing X-gluc. B: DH5α cells transformed with pKW244 and plated on LB media containing X-gluc. The gus operon is induced as shown by the presence of blue colonies.

[0024] FIG. 10 is a restriction map of pMEL1.

[0025] FIG. 11 is a restriction map of pMEL3.

[0026] FIG. 12 is a restriction map of pMEL4.

[0027] FIG. 13 is a restriction map of pMEL5.

[0028] FIG. 14 is a restriction map of pMEL8.

[0029] FIG. 15 diagrams subclones of the gus operon regulatory region and shows relative repressor titration of these subclones in DH5α expressed as a percentage of pKW244 titration.

[0030] FIG. 16 depicts the location and sequence of the HpaI centered palindrome upstream of gusA.

[0031] FIG. 17 depicts the location and sequence of the HpaI centered palindrome located upstream of gusR.

[0032] FIG. 18 depicts the location and sequence of the Psp1406I palindromes upstream of gusA.

[0033] FIG. 19 diagrams subclones of the gus operon regulatory region and shows relative repressor titration of these subclones in ER1648, expressed as a percentage of pKW244 titration.

[0034] FIG. 20 shows a restriction map of pKW224.

[0035] FIG. 21 shows a restriction map of pMEL101.

[0036] FIG. 22 is a photograph of a protein gel showing overexpression of a 26 kDa gusR/lacZ fusion protein from pMEL101 and a 22 kDa gusR protein from pMEL103.

[0037] FIG. 23 shows a restriction map of pMEL103.

[0038] FIG. 24 is a photograph of a protein gel showing overexpression of a 26 kDa gusR/lacZ fusion protein (indicated with arrow on right side) from pKW241 and a 22 kDa gusR protein (indicated with arrow on left side) from pKW288 and pKW289.

[0039] FIG. 25 is a computer image of a protein gel showing purification of gusR on a Sepharose CL6B column coupled with phenylthio-β-D-glucuronide. Lane 1, protein size markers; lane 2, sample flow-through; lane 3, fraction collected from first buffer wash; lane 4, fraction collected from second buffer wash; lane 5, gusR standard; lane 6, first fraction collected from elution with 0.1 M NaCl; lane 7, second fraction collected from elution with 0.1 M NaCl; lane 8, first fraction collected from elution with 0.3 M NaCl; lane 9, second fraction collected from elution with 0.3 M NaCl.

[0040] FIG. 26 is a computer image of a protein gel showing purification of gusR on an agarose column coupled with saccharolactone. Lane 1, protein size markers; lane 2, sample flow-through; lane 3, fraction collected from first buffer wash; lane 4, fraction collected from second buffer wash; lane 5, fraction collected from elution with 0.1 M NaCl; lane 6, second fraction collected from elution with 0.5 M NaCl; lane 7, gusR standard.

[0041] FIG. 27 is a computer image of a protein gel showing purification of hexahistidine-modified gusR from an induced culture on a Sepharose column coupled with nickel. Lane 1, first elution using 10 mM EDTA in IMAC buffer; lane 2, second elution using 10 mM EDTA in IMAC buffer; lane 3, third elution using 10 mM EDTA in IMAC buffer.

DETAILED DESCRIPTION OF THE INVENTION

[0042] Prior to setting forth the invention, it may be helpful to an understanding thereof to set forth definitions of certain terms that will be used hereinafter.

[0043] As used herein, “glucuronide” or “β-glucuronide” refers to any aglycon conjugated in a hemiacetal linkage, typically through the hydroxyl group, to the C1 of a free D-glucuronic acid in the β configuration. Glucuronides are generally very water soluble, due to the ionizable carboxylic acid group at the 6-carbon position in the glycon. Most aromatic and aliphatic glucuronides are remarkably stable relative to other types of glycoside conjugates, which may be due to the inductive effect of the carbonyl group at C-6 on the hemiacetal linkage at C-1. For example, colorigenic and fluorogenic substrates, such as p-nitrophenyl β-D-glucuronide and 4-methylumbelliferyl β-D-glucuronide, are much more stable in aqueous solution than the corresponding β-D-galactosides or β-D-glucosides, making background due to spontaneous hydrolysis much less of a problem. Many β-glucuronides can be prepared free of other contaminating glycosides by vigorous acid hydrolysis, which cleaves glucosides, galactosides and other glycosides, but leaves most glucuronides intact. For example, complex carbohydrate polymers such as gum arabic can be reduced to a collection of monosaccharide components, and the single β-glucuronyl disaccharide aldobiuronic acid, simply by boiling gum arabic in sulfuric acid overnight.

[0044] β-glucuronides consist of virtually any compound linked to the 1-position of glucuronic acid as a beta anomer, and are typically, though by no means exclusively, found as the —O-glycoside. β-glucuronides are produced naturally through the action of UDP-glucuronyl transferase in many cells and tissues by most vertebrates as a part of the process of solubilizing, detoxifying, and mobilizing both natural and xenobiotic compounds, and thus directing them to sites of excretion or activity through the circulatory system. E. coli is able to cleave such glucuronides into their constituent molecules and use the glucuronic acid as an energy source through metabolism by the hexuronide-hexuronate pathway.

[0045] β-glucuronides in polysaccharide form are common in nature, most abundantly in vertebrates, where they are major constituents of connective and lubricative tissues (e.g., chondroitin sulfate of cartilage, and hyaluronic acid, which is the principal constituent of synovial fluid and mucus) in polymeric form with other sugars such as N-acetylglucosamine. β-glucuronides are relatively uncommon in plants. However, some plant gums and mucilages produced by wounded trees, notably gum arabic from Acacia senegal, do contain significant amounts of β-glucuronides in polymeric form, although rarely if ever as terminal residues that would serve as GUS substrates. Glucuronides and galacturonides found in plant cell wall components (such as pectin) are generally in the alpha configuration, and are frequently substituted as the 4-O-methyl ether; hence, these are not substrates for β-glucuronidase.

[0046] Within the context of this invention, certain β-glucuronide derivatives are used. Such β-glucuronide derivatives have the formula (1):

[0047] wherein R₁ is an aglycon moiety, R₂ is a hydrophobic moiety, and L₁ and L₂ are independently selected from linking groups. Preferred linking groups are independently selected from a direct bond, —O—, —OC(═O)—, —C(═O)O—, —C(═O)—, —CH(OR₃)—, —N(R₃)—, —N(R₃)C(═O)—, —C(═O)N(R₃)—, —N(R₃)C(═O)O—, —OC(═O)N(R₃)—, —S—, and —SS—, where R₃ is H or a C₁-C₂₂ hydrocarbon group.

[0048] In a first embodiment: R₁ is an aglycon moiety; L₁ is selected from a direct bond, —O—, —OC(═O)—, —C(═O)O—, —C(═O)—, —CH(OR₃)—, —N(R₃)—, —N(R₃)C(═O)—, —C(═O)N(R₃)—, —N(R₃)C(═O)O—, —OC(═O)N(R₃)—, —S—, and —SS—; R₂ is a hydrophobic moiety; L₂ is selected from a direct bond, —O—, —OC(═O)—, —C(═O)—, —N(R₃)—, —N(R₃)C(═O)—, and —S—; and R₃ is H or a C₁-C₂₂ hydrocarbon group.

[0049] In a preferred first embodiment: R₁ is an aglycon moiety; L₁ is selected from a direct bond, —O—, —OC(═O)—, —C(═O)O—, —C(═O)—, —CH(OR₃)—, —N(R₃)—, —N(R₃)C(═O)—, —C(═O)N(R₃)—, —N(R₃)C(═O)O—, —OC(═O)N(R₃)—, —S—, and —SS—; R₂ is a lipid (—CH₂—CH(OC(═O)R₃)—CH₂(OC(═O)R₃)) or a C₁-C₂₂ hydrocarbon group; L₂ is selected from a direct bond, —O—, —OC(═O)—, —C(═O)—, —N(R₃)—, —N(R₃)C(═O)—, and —S—; and R₃ is H or a C₁-C₂₂ hydrocarbon group.

[0050] In a more preferred first embodiment: R₁ is an aglycon moiety; L₁ is selected from a direct bond, —O—, —OC(═O)—, —C(═O)O—, —C(═O)—, —CH(OR₃)—, —N(R₃)—, —N(R₃)C(═O)—, —C(═O)N(R₃)—, —N(R₃)C(═O)O—, —OC(═O)N(R₃)—, —S—, and —SS—; R₂ is selected from C₁-C₂₂alkyl, C₆-C₂₂aryl, C₃-C₂₂cycloalkyl, C₇-C₂₂arylalkyl, C₇-C₂₂alkylaryl and unsaturated derivatives thereof; L₂ is selected from a direct bond, —O—, and —N(R₃)—; and R₃ is H.

[0051] In a second embodiment: R₁ is an aglycon moiety; L₁ is a non-cleavable linkage selected from a direct bond, —OC(═O)—, —C(═O)O—, —C(═O)—, —CH(OR₃)—, —N(R₃)—, —N(R₃)C(═O)—, —C(═O)N(R₃)—, —N(R₃)C(═O)O—, —OC(═O)N(R₃)—, —S—, and —SS—; R₂ is a hydrophobic group; L₂ is selected from a direct bond, —O—, —OC(═O)—, —C(═O)—, —N(R₃)—, —NHC(═O)—, and —S—; and R₃ is H or a C₁-C₂₂ hydrocarbon group.

[0052] In a preferred second embodiment: R₁ is a fluorogenic or chromogenic moiety; L₁ is a non-cleavable linkage selected from a direct bond, —OC(═O)—, —C(═O)O—, —C(═O)—, —CH(OR₃)—, —N(R₃)—, —N(R₃)C(═O)—, —C(═O)N(R₃)—, —N(R₃)C(═O)O—, —OC(═O)N(R₃)—, —S—, and —SS—; R₂ is a hydrophobic group; L₂ is selected from a direct bond, —O—, —OC(═O)—, —C(═O)—, —N(R₃)—, —NHC(═O)—, and —S—; and R₃ is H or a C₁-C₂₂ hydrocarbon group.

[0053] In a more preferred second embodiment: R₁ is a fluorogenic moiety selected from 4-methylumbelliferone, 3-cyano-4-methylumbelliferone, 4-trifluoromethylumbelliferone, fluorescein, 3-O-methylfluorescein and resorufin, or a chromogenic moiety selected from 5-bromo-4-chloro-3-indoxyl, naphthol ASBI, phenolphthalein and p-nitrophenol; L₁ is selected from a direct bond, —N(R₃)—, and —S—; R₂ is a hydrophobic group; L₂ is selected from a direct bond, —O—, —OC(═O)—, —C(═O)—, —N(R₃)—, —NHC(═O)—, and —S—; and R₃ is H.

[0054] In a third embodiment: R₁ is an aglycon moiety; L₁ is a non-cleavable linkage selected from a direct bond, —OC(═O)—, —C(═O)O—, —C(═O)—, —CH(OR₃)—, —N(R₃)—, —N(R₃)C(═O)—, —C(═O)N(R₃)—, —N(R₃)C(═O)O—, —OC(═O)N(R₃)—, —S—, and —SS—; R₂ is a hydrophobic group; L₂ is selected from a direct bond, —O—, —OC(═O)—, —C(═O)—, —N(R₃)—, —NHC(═O)—, and —S—; and R₃ is H or a C₁-C₂₂ hydrocarbon group.

[0055] In a preferred third embodiment: R₁ is a fluorogenic or a chromogenic moiety; L₁ is selected from a direct bond, —N(R₃)—, and —S—; R₂ is a lipid (—CH₂—CH(OC(═O)R₃)—CH₂(OC(═O)R₃)) or a C₁-C₂₂ hydrocarbon group; L₂ is selected from a direct bond, —O—, and —N(R₃)—; and R₃ is H.

[0056] In a more preferred third embodiment: R₁ is a fluorogenic moiety selected from 4-methylumbelliferone, 3-cyano-4-methylumbelliferone, 4-trifluoromethylumbelliferone, fluorescein, 3-O-methylfluorescein and resorufin, or a chromogenic moiety selected from 5-bromo-4-chloro-3-indoxyl, naphthol ASBI, phenolphthalein and p-nitrophenol; L₁ is selected from a direct bond, —N(R₃)—, and —S—; R₂ is selected from C₁-C₂₂alkyl, C₆-C₂₂aryl, C₃-C₂₂cycloalkyl, C₇-C₂₂arylalkyl, C₇-C₂₂alkylaryl and unsaturated derivatives thereof; L₂ is selected from a direct bond, —O—, and —N(R₃)—; and R₃ is H.

[0057] Compounds of formula (1) may be prepared by methodology known in the art. The compound of formula (1) wherein —L₁—R₁ and —L₂—R₂ are both —OH is known as glucuronic acid, and is commercially available from many sources. Also commercially available are some glucuronic acid derivatives wherein R₁ is a fluorogenic or chromogenic moiety. In order to provide compounds of formula (1) wherein —L₂—R₂ is other than —OH, the parent glucuronic acid may be esterified with an alcohol R₂—OH (to provide compounds wherein L₂ is oxygen), or reacted with an amine R₂—N(R₃)H, to provide amide compounds (L₂ is N(R₃)). Other derivatives may be prepared by procedures known in the art. See, e.g., Advanced Organic Chemistry (3rd edition) by J. March (McGraw-Hill Book Company). In some instances, the hydroxyl groups of the pyran ring in formula (1) may need to be protected, but this may be accomplished by known synthetic methodology. See, e.g., Greene, “Protective Groups in Organic Chemistry”, John Wiley & Sons, New York N.Y. (1981).

[0058] As used herein, a “glucuronide operon” or a “GUS operon” refers to the concert of enzymes involved in transporting and cleaving β-glucuronides and the regulatory sequences. In E. coli, the operon comprises a repressor (gusR), a promoter/operator sequence, β-glucuronidase (gusA or GUS), β-glucuronide permease (gusB), and a membrane protein (gusC) (see FIG. 1). Glucuronide operons or the vertebrate equivalent are found in most vertebrates and many mollusks (Levvy and Conchie, in Glucuronic Acid, Free and Combined, Dutton, G. J., ed. Academic Press, New York, 301, 1966). In contrast, glucuronide operons are largely, if not completely, absent from higher plants, mosses, ferns, insects, fungi, molds, and most bacterial genera, E. coli and Shigella being exceptions.

[0059] As used herein, a “glucuronide repressor” refers to a protein that has at least two interacting domains, one that binds a specific DNA sequence, and the other that binds a β-glucuronide or β-glucuronide derivative, such that the DNA binding is dependent upon β-glucuronide (or derivative) binding. The interaction may cause the protein to release from the glucuronide operator, as for a classical bacterial repressor, or bind to the operator as for a typical eukaryotic transcriptional activator. In addition, the repressor may have a third domain that allows dimerization of the protein. As noted above, most vertebrates and some mollusks have β-glucuronidase activity. The bacterial species, E. coli and Shigella, have a glucuronide repressor. In addition to referring to a glucuronide repressor from different species, glucuronide repressor also encompasses variants, including alleles, thereof. For certain embodiments, a variant, including an allele, must bind a β-glucuronide. For other embodiments, a variant must bind a glucuronide operator sequence. A variant may be a portion of the repressor and/or contain amino acid substitutions, insertions, and deletions. A variant may also be sufficiently similar in nucleotide sequence to hybridize to the native sequence.

[0060] As used herein, a “glucuronide operator” or “glucuronide operator sequence” refers to the specific nucleotide sequence bound by a glucuronide repressor. For example, the region containing the glucuronide operator sequence in E. coli is shown in SEQ ID NO: 3. More precise mapping of the operator site is discussed below and is presented in FIG. 18. The operator sequence may have nucleotide changes from native sequence as long as the repressor binds. Some changes may cause increased affinity of the repressor, some may cause decreased affinity. In general, increased affinity is preferred within the context of this invention.

[0061] As used herein, “β-glucuronidase” refers to an enzyme that catalyzes the hydrolysis of β-glucuronides and derivatives. Almost any β-D-glucuronide serves as a substrate. For assays to detect β-glucuronidase activity, fluorogenic or chromogenic substrates are preferred. Such substrates include, but are not limited to, p-nitrophenyl β-D-glucuronide and 4-methylumbelliferyl β-D-glucuronide, and the glucuronide conjugates of the R—OH groups depicted in FIG. 2. Assays for β-glucuronidase activity, also known as GUS activity, are provided in U.S. Pat. No. 5,268,463.

[0062] A. Repressor Gene and Gene Product

[0063] As noted above, this invention provides gene sequence and gene product for a glucuronide repressor. Glucuronide repressor genes may be isolated by genetic, biochemical, or immunological methods. Some of the suitable nucleic acid molecules include either DNA, RNA, or hybrid molecules that encode a protein comprising the amino acid sequence depicted in SEQ ID No. 2 or variants thereof, that hybridize under stringent conditions (e.g., 5× SSPE, 0.5% SDS, 1× Denhardt's at 65° C. or equivalent conditions; see Ausubel, supra; Sambrook, supra) to the complement of the nucleotide sequence depicted in SEQ ID No. 1, that are codon optimized for a particular host species and which encode a glucuronide repressor as discussed herein or variants thereof, and molecules that hybridize under stringent conditions to the complement of the codon optimized molecule.

[0064] As exemplified herein, a gene encoding an E. coli glucuronide repressor was identified genetically and by DNA sequence analysis. Other glucuronide repressors may be identified in genomic or cDNA libraries by cross-hybridization with the E. coli repressor gene sequence, by complementation, by function, or by antibody screening on an expression library (see Sambrook et al., infra; Ausubel et al., infra, for methods and conditions appropriate for isolation of a glucuronide repressor from other species). Merely as an example, the isolation of the E. coli glucuronide repressor is provided herein.

[0065] Glucuronide Repressor Genes and Proteins

[0066] The existence of a glucuronide repressor in E. coli (gusR) was established by genetic and biochemical experiments and genetically mapped to a region upstream of the glucuronidase gene (gusA). Moreover, gusR repression of β-glucuronidase activity has been shown by Northern analysis to down-regulate transcription of gusA. RNA from uninduced cultures of E. coli showed no hybridization to a gusA probe, in contrast to the strong hybridization observed to RNA extracted from cultures that had been induced by methyl β-D-glucuronide (Jefferson, supra). GusR acts by binding to gusA operator sequences, thus preventing transcription, this repression being relieved when a glucuronide substrate binds to the repressor and inactivates it.

[0067] A chromosomal region of E. coli known to encode gusA (encoding beta-glucuronidase, U.S. Pat. No. 5,268,463) and gusB (encoding the glucuronide permease, U.S. Pat. No. 5,432,081) was cloned as a Pst I-Hind III fragment from digested E. coli genomic DNA into a low-copy plasmid vector pRK404 (pKW212) or a high copy vector, pBSIISK+ (pKW214). It had previously been shown that cloning a smaller fragment containing the gusA and gusB genes alone gave rise to high levels of constitutive GUS activity as measured in cell extracts using the substrate p-nitrophenyl-glucuronide. However, clones pKW212 and pKW214, extending several kilobases in either direction of gusA and gusB, did not give constitutive activity, but required induction by addition of a GUS substrate, such as p-nitrophenyl-glucuronide. Thus, the larger Pst I-Hind III DNA fragment contained a gene capable of repressing the transcription of gusA and gusB, and the repression could be relieved by the addition of a substrate molecule.

[0068] Two subclones of the Pst I-Hind III fragment of pKW212 were generated, the first being a large EcoR I-Hind III fragment known to comprise the gus promoter and the gusABC genes (pKW222). The second subclone was constructed from an approximately 1.4 kb BstX I-Nco I fragment, which extended from a BstX I site 3′ of the Pst I site to an Nco I site downstream of the unique EcoR I site. This fragment, which mapped upstream of gusA, was cloned to create pKW223 (FIG. 3).

[0069] pKW222, when transformed into a strain deleted for the entire gus operon region (KW1), shows a high level of constitutive GUS activity. However, when this transformed strain is further transformed with the compatible plasmid pKW223, virtually all the activity is eliminated, indicating that pKW223 comprises a gene or DNA sequence which can repress the expression of the gus operon. Moreover, this repression is reversible by addition of a suitable inducer molecule such as X-glcA (5-bromo-4-chloro-3-indolyl-β-D-glucuronide). This is demonstrated by the production of deep blue colonies when plated on the indigogenic substrate X-glcA.
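The outcome of the complementation experiment above can be summarized as a small truth table. The following is only an illustrative sketch: the function name and the qualitative labels ("constitutive", "repressed", "induced") are assumptions introduced here for illustration and are not measurements from the experiment.

```python
def gus_phenotype(has_gus_operon: bool, has_repressor: bool, inducer_present: bool) -> str:
    """Qualitative GUS expression state expected under the repressor model.

    The labels are illustrative; they restate the observations reported
    for strain KW1 carrying pKW222 and/or pKW223.
    """
    if not has_gus_operon:
        return "none"            # KW1 is deleted for the gus operon region
    if not has_repressor:
        return "constitutive"    # pKW222 alone: high constitutive GUS activity
    # With gusR present, expression depends on an inducer such as X-glcA.
    return "induced" if inducer_present else "repressed"


# (operon present, repressor present, inducer present) for each case
cases = {
    "KW1": (False, False, False),
    "KW1 + pKW222": (True, False, False),
    "KW1 + pKW222 + pKW223": (True, True, False),
    "KW1 + pKW222 + pKW223 + X-glcA": (True, True, True),
}

for name, state in cases.items():
    print(f"{name}: {gus_phenotype(*state)}")
```

The sketch is purely qualitative; it does not model partial induction or copy-number effects.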

[0070] The DNA sequence of the GUS gene region was determined from the inserts of pKW222 and pKW223 and is presented in SEQ. ID NO: 4. The gusABC genes were identified, and coding sequence for gusA begins at nucleotide 1466. Two large open reading frames 5′ of gusA were identified at nucleotides 1-264 and 485-1075. The 5′-most reading frame was identified as a partial coding sequence for 7-alpha-hydroxysteroid dehydrogenase. The predicted amino acid sequence of the second open reading frame has significant sequence similarity to other bacterial transcriptional repressors, thus providing evidence that this open reading frame encodes gusR. The predicted repressor protein is 195 amino acids; the translational start codon, which was determined by N-terminal amino acid sequence analysis on purified gusR protein, is the second methionine residue in the open reading frame (SEQ ID No: 2; nucleotide 488 in SEQ ID No: 4). The repressor protein appears to have three domains: a DNA binding domain of approximately 60 amino acids; a glucuronide binding domain of from about 100 to 140 amino acids; and a domain of about 40 amino acids that has a leucine zipper similar to other transcription factors and which may mediate dimerization. The precise boundaries of these domains, and whether there are two or three separable domains, have not been definitively established; however, the minimal sequence necessary for function of the domains is identifiable by the assays described herein.
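The coordinates given above can be checked for internal consistency with a few lines of arithmetic. The sketch below assumes that the coordinates are 1-based and inclusive and that the open reading frame ending at nucleotide 1075 includes the stop codon; both are assumptions, since the text does not state them explicitly.

```python
# Coordinates taken from the text (numbering of SEQ ID No: 4).
ORF_START, ORF_END = 485, 1075   # second open reading frame
START_CODON = 488                # first base of the second methionine codon


def span_length(start: int, end: int) -> int:
    """Length in nucleotides of a 1-based, inclusive interval."""
    return end - start + 1


cds_nt = span_length(START_CODON, ORF_END)

# The start codon must be in frame with the beginning of the ORF.
in_frame = (START_CODON - ORF_START) % 3 == 0

# Assumption: the final codon of the interval is the stop codon.
codons = cds_nt // 3
residues = codons - 1

print(f"coding span: {cds_nt} nt, in frame: {in_frame}, residues: {residues}")
```

Under these assumptions the span from nucleotide 488 to 1075 is 588 nucleotides, or 196 codons, which corresponds to 195 residues plus a stop codon and agrees with the stated protein length.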

[0071] In other aspects of this invention, isolated glucuronide repressor proteins or glucuronide-binding proteins are provided. In addition, depending upon the use of the repressor protein, it may be desirable that such proteins bind a variety of glucuronides or as few as one specific glucuronide. Specificity of binding is achieved by creating a variant of the glucuronide repressor and testing the variant for the desired activity. Variants of the DNA binding domain to create higher or lower affinity and of the dimerization domain to increase or abolish dimerization potential are also useful within the context of this invention.

[0072] Variants of a glucuronide repressor include amino acid substitutions, deletions, insertions, and fusion proteins and are constructed by any of the well known methods in the art (see, generally, Ausubel et al., supra; Sambrook et al., supra). Such methods include site-directed oligonucleotide mutagenesis, restriction enzyme digestion and removal of bases or insertion of bases, amplification using primers containing mismatches or additional nucleotides, and the like. Variants of a DNA sequence of a glucuronide repressor include the nucleotide changes necessary to express a repressor protein having amino acid substitutions, deletions, insertions, and the like and nucleotide changes that result from alternative codon usage. For example, if the repressor protein is expressed in a heterologous species, codon optimization for that species may be desirable.
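Codon optimization of the kind mentioned above can be sketched as a back-translation with one preferred codon per amino acid. In the sketch below, the codon table is an illustrative assumption (one valid codon per residue, loosely modeled on codons often favored in E. coli) and not a validated usage table; the peptide is hypothetical and is not taken from SEQ ID No: 2.

```python
# Hypothetical preference table: one valid codon per amino acid.
# A real table would come from codon usage data for the chosen host.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def back_translate(peptide: str, table=PREFERRED_CODON, stop: str = "TAA") -> str:
    """Return a coding sequence for `peptide` using one codon per residue."""
    try:
        codons = [table[aa] for aa in peptide.upper()]
    except KeyError as err:
        raise ValueError(f"unknown amino acid: {err.args[0]!r}") from None
    return "".join(codons) + stop


# Hypothetical peptide used only to demonstrate the call.
print(back_translate("MKLD"))
```

A practical design would also screen the result for unwanted restriction sites and secondary structure, which this sketch does not attempt.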

[0073] In addition to directed mutagenesis in which one or a few amino acids are altered, variants that have multiple substitutions may be generated. The substitutions may be scattered throughout the protein or functional domain or concentrated in a small region. For example, the operator-binding domain is mutagenized in the region of likely DNA contact residues by oligonucleotide-directed mutagenesis in which the oligonucleotide contains a string of dN bases or the region is excised and replaced by a string of dN bases. Thus, a population of variants with a randomized amino acid sequence in a region is generated. The variant with the desired properties (e.g., higher binding affinity to the glucuronide operator) is then selected from the population. In similar manner, multiple variants of the glucuronide-binding domain are generated. These variants are selected for binding to a particular glucuronide, preferably to the exclusion of or with much lower affinity to other glucuronides.
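The size of a population produced by a string of dN bases grows quickly with the number of randomized positions, which bears on how many clones must be screened. The sketch below is a minimal illustration: the flanking sequences are hypothetical, and the stop-free estimate assumes equal representation of all four bases and the standard genetic code (3 stop codons out of 64).

```python
from itertools import product


def dna_library_size(n_random_bases: int) -> int:
    """Number of distinct DNA sequences for n fully randomized (dN) positions."""
    return 4 ** n_random_bases


def randomized_oligos(left: str, right: str, n_random_bases: int):
    """Yield every oligonucleotide with n dN positions between fixed flanks."""
    for bases in product("ACGT", repeat=n_random_bases):
        yield left + "".join(bases) + right


def stop_free_fraction(n_random_codons: int) -> float:
    """Expected fraction of clones with no stop codon in the randomized codons."""
    return (61 / 64) ** n_random_codons


# Hypothetical flanks; two randomized codons (six dN positions).
oligos = list(randomized_oligos("GGATCC", "GAATTC", 6))
print(len(oligos), dna_library_size(6))
print(f"{stop_free_fraction(2):.3f}")
```

Enumerating every oligonucleotide is only practical for short randomized stretches; for longer stretches the closed-form counts are the useful part.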

[0074] In other embodiments, the repressor protein comprises a fusion protein of a glucuronide-binding domain and a sequence-specific DNA binding domain or a fusion protein of a repressor and a molecule that binds the aglycon portion of the glucuronide. Construction of these fusion proteins is preferably accomplished by amplification of the desired domain regions and ligation of the amplified products. One of skill in the art recognizes that other routine methods and procedures may be alternatively used.

[0075] The glucuronide repressors will have at least a DNA binding domain and a glucuronide binding domain. For most repressor molecules, these domains are distinct sequences, although overlap of sequence is possible. For example, the dimerization domain of a repressor protein may be inseparable from another functional domain. In E. coli, the gusR repressor has a DNA binding domain comprising approximately the first 60 to 65 residues, and a glucuronide binding domain extending from approximately residues 60-65 to approximately residue 160. These domains may be somewhat larger or smaller, and assays for determining the boundaries of these domains are provided herein. To construct a fusion repressor, oligonucleotide primers derived from the sequences flanking the glucuronide binding domain are synthesized and used to amplify the domain. Restriction sites are preferably included in the primers to facilitate ligation and cloning. Similarly, primers flanking a DNA binding domain selected from a DNA-binding protein, such as cro, lac repressor, glucocorticoid receptor, trp repressor, TFIIIA, Sp-1, GCN4, AP-2, GAL4 repressor and any transcription factor, including activators and repressors, with a known DNA sequence that the factor binds, are useful within the context of this invention (see, Sauer and Pabo, Ann. Rev. Biochem. 61:1053-1095, 1992). Compatible restriction sites are preferably incorporated into the primers, such that the products when joined are in the same reading frame. Amplified products of the two domains are restricted and ligated together and inserted into an appropriate vector. Verification of the resulting clone is readily done by restriction mapping and DNA sequence analysis. DNA sequence analysis is preferable so that the reading frame across the junction can be verified.
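The requirement that the joined products be in the same reading frame can be checked in silico before cloning. The following is a minimal sketch, assuming each segment is supplied as a whole number of codons and that the joining site is a six-base restriction site (GGATCC, the BamHI recognition sequence, is used as an example). The domain sequences are hypothetical placeholders.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(dna_binding, site, glucuronide_binding):
    """Join two domain sequences through a restriction site, keeping one reading frame.

    Raises ValueError if any segment is not a whole number of codons, or if a
    stop codon appears before the final codon of the fusion.
    """
    for name, segment in (("DNA binding domain", dna_binding),
                          ("restriction site", site),
                          ("glucuronide binding domain", glucuronide_binding)):
        if len(segment) % 3 != 0:
            raise ValueError(f"{name} would shift the reading frame")
    fusion = dna_binding + site + glucuronide_binding
    codons = [fusion[i:i + 3] for i in range(0, len(fusion), 3)]
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        raise ValueError("premature stop codon in fusion")
    return fusion


# Hypothetical placeholder segments (not gusR or any real domain).
fusion = fuse_in_frame("ATGAAACGT", "GGATCC", "GCTGAAGGTTAA")
```

A check of this kind complements, but does not replace, DNA sequence analysis of the resulting clone.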

[0076] In similar manner, a fusion protein of a repressor and an amino acid sequence that binds the aglycon is constructed. The repressor may be the glucuronide repressor or a fusion protein as described above. The amino acid sequence that binds the aglycon includes, but is not limited to, single chain antibodies, natural substrates or ligands, and the like. The additional part of the fusion protein is designed to confer increased specificity of the repressor for the glucuronide.

[0077] Vectors, Host Cells and Means of Expressing and Producing Protein

[0078] The glucuronide repressor may be expressed in a variety of host organisms. Preferably, the repressor is produced in bacteria, such as E. coli, for which many expression vectors have been developed and are available. Other suitable host organisms include other bacterial species, and eukaryotes, such as yeast (e.g., Saccharomyces cerevisiae), mammalian cells (e.g., CHO and COS-7), and insect cells (e.g., Sf9).

[0079] A DNA sequence encoding the repressor is introduced into an expression vector appropriate for the host. The repressor sequence is derived from an existing cDNA or synthesized. A preferred means of synthesis is amplification of the gene from cDNA using a set of primers that flank the coding region or the desired portion of the protein. As discussed above, the repressor sequence may contain alternative codons for each amino acid with multiple codons. The alternative codons can be chosen as “optimal” for the host species. Restriction sites are typically incorporated into the primer sequences and are chosen with regard to the cloning site of the vector. If necessary, translational initiation and termination codons can be engineered into the primer sequences.

[0080] At minimum, the vector must contain a promoter sequence. Other regulatory sequences may be included. Such sequences include a transcription termination signal sequence, secretion signal sequence, origin of replication, selectable marker, and the like. The regulatory sequences are operationally associated with one another to allow transcription or translation.

[0081] The plasmids used herein for expression of glucuronide repressor include a promoter designed for expression of the proteins in a bacterial host. Suitable promoters are widely available and are well known in the art. Inducible or constitutive promoters are preferred. Such promoters for expression in bacteria include promoters from the T7 phage and other phages, such as T3, T5, and SP6, and the trp, lpp, and lac operons. Hybrid promoters (see, U.S. Pat. No. 4,551,433), such as tac and trc, may also be used. Promoters for expression in eukaryotic cells include the P10 or polyhedrin gene promoter of baculovirus/insect cell expression systems (see, e.g., U.S. Pat. Nos. 5,243,041, 5,242,687, 5,266,317, 4,745,051, and 5,169,784), MMTV LTR, RSV LTR, SV40, metallothionein promoter (see, e.g., U.S. Pat. No. 4,870,009) and other inducible promoters. For expression of the proteins, a promoter is inserted in operative linkage with the coding region for the glucuronide repressor.

[0082] The promoter controlling transcription of the glucuronide repressor may itself be controlled by a repressor. In some systems, the promoter can be derepressed by altering the physiological conditions of the cell, for example, by the addition of a molecule that competitively binds the repressor, or by altering the temperature of the growth media. Preferred repressor proteins include, but are not limited to, the E. coli lacI repressor responsive to IPTG induction, the temperature sensitive λcI857 repressor, and the like. The E. coli lacI repressor is preferred.

[0083] In other preferred embodiments, the vector also includes a transcription terminator sequence. A “transcription terminator region” has a sequence that provides a signal that terminates transcription by the polymerase that recognizes the selected promoter and/or a signal sequence for polyadenylation.

[0084] Preferably, the vector is capable of replication in bacterial cells. Thus, the vector preferably contains a bacterial origin of replication. Preferred bacterial origins of replication include the f1-ori and col E1 origins of replication, especially the ori derived from pUC plasmids.

[0085] The plasmids also preferably include at least one selectable marker that is functional in the host. A selectable marker gene includes any gene that confers a phenotype on the host that allows transformed cells to be identified and selectively grown. Suitable selectable marker genes for bacterial hosts include the ampicillin resistance gene (Amp^(r)), tetracycline resistance gene (Tc^(r)) and the kanamycin resistance gene (Kan^(r)). The kanamycin resistance gene is presently preferred. Suitable markers for eukaryotes usually require a complementary deficiency in the host (e.g., thymidine kinase (tk) in tk- hosts). However, drug markers are also available (e.g., G418 resistance and hygromycin resistance).

[0086] The sequence of nucleotides encoding the glucuronide repressor may also include a secretion signal, whereby the resulting peptide is a precursor protein processed and secreted. The resulting processed protein may be recovered from the periplasmic space or the fermentation medium. Secretion signals suitable for use are widely available and are well known in the art (von Heijne, J. Mol. Biol. 184:99-105, 1985). Prokaryotic and eukaryotic secretion signals that are functional in E. coli (or other host) may be employed. The presently preferred secretion signals include, but are not limited to, those encoded by the following E. coli genes: pelB (Lei et al., J. Bacteriol. 169:4379, 1987), phoA, ompA, ompT, ompF, ompC, beta-lactamase, and alkaline phosphatase.

[0087] One skilled in the art appreciates that a wide variety of suitable vectors for expression in bacterial cells exist and are readily obtainable. Vectors such as the pET series (Novagen, Madison, Wis.) and the tac and trc series (Pharmacia, Uppsala, Sweden) are suitable for expression of a glucuronide repressor. Baculovirus vectors, such as pBlueBac (see, e.g., U.S. Pat. Nos. 5,278,050, 5,244,805, 5,243,041, 5,242,687, 5,266,317, 4,745,051, and 5,169,784; available from Invitrogen, San Diego) may be used for expression of the repressor in insect cells, such as Spodoptera frugiperda Sf9 cells (see, U.S. Pat. No. 4,745,051).

[0088] The choice of a bacterial host for the expression of a glucuronide repressor is dictated in part by the vector. Commercially available vectors are paired with suitable hosts.

[0089] Repressor protein is isolated by standard methods, such as affinity chromatography, size exclusion chromatography, ion exchange chromatography, HPLC, and other known protein isolation methods (see generally Ausubel et al., supra; Sambrook et al., supra). An isolated purified protein gives a single band on SDS-PAGE when stained with Coomassie blue.

[0090] Preferably, the repressor protein is expressed as a hexahis fusion protein and isolated by metal-containing chromatography, such as nickel-coupled beads. Briefly, a sequence encoding His₆ is linked to a DNA sequence encoding a repressor. Although the His₆ sequence can be positioned anywhere in the molecule, preferably it is linked at the 3′ end immediately preceding the termination codon. The His-gusR fusion may be constructed by any of a variety of methods. A convenient method is amplification of the gusR gene using a downstream primer that contains the codons for His₆ (see Example 3C).
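As a hedged illustration of a downstream primer of this kind, the sketch below assembles the sense-strand 3′ end (the last coding bases, six histidine codons, and a termination codon) and takes its reverse complement. The coding-sequence fragment is a hypothetical placeholder rather than the actual gusR 3′ end, and the choice of CAC as the histidine codon and TAA as the stop codon are assumptions made for the example; the primer actually used is described in Example 3C.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def his6_downstream_primer(cds_3prime_without_stop, his_codon="CAC", stop="TAA"):
    """Build a downstream (antisense) primer adding His6 just before the stop codon.

    cds_3prime_without_stop: last in-frame coding bases of the gene, stop codon removed.
    """
    if len(cds_3prime_without_stop) % 3 != 0:
        raise ValueError("annealing region must be a whole number of codons")
    sense = cds_3prime_without_stop + his_codon * 6 + stop
    return reverse_complement(sense)


# Hypothetical 3' coding fragment (placeholder, not the gusR sequence).
primer = his6_downstream_primer("GCTGAAGGT")
```

In practice a restriction site would typically also be appended downstream of the stop codon to facilitate cloning.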

[0091] A repressor protein can also be purified by virtue of its binding to β-glucuronides that are competitive inhibitors of β-glucuronidase. The glucuronides are coupled to an affinity matrix, such as Sepharose or agarose, through a carbodiimide-mediated crosslinking or other suitable method. For example, phenylthio-β-D-glucuronide-Sepharose CL6B and saccharolactone-agarose (Biosynth AG, Switzerland) both bind gusR protein, which can then be eluted from the matrix with an appropriate salt concentration.

[0092] Assays for Function of Glucuronide Repressor Protein

[0093] Repressor activity is conveniently measured by a variety of assays, including genetic and biochemical assays. Briefly, a strain deleted for the entire gus operon (e.g., KW1) is transformed by a plasmid containing the operator region and gusABC genes. Alternatively, a strain deleted for the repressor gene sequences may be used. Such a strain constitutively expresses gusA, the activity of which may be readily detected by a β-glucuronidase substrate, preferably a chromogenic substrate (e.g., 5-bromo-4-chloro-3-indoxyl-glucuronide) or fluorogenic substrate (e.g., 4-methylumbelliferone-glucuronide). This strain is further transformed with a plasmid that expresses the repressor or candidate repressor protein. If repressor activity is present, virtually all glucuronidase activity is eliminated. Repression is relieved by addition of a suitable glucuronide inducer. Variations of this assay, such as the choice of substrate, inducer, strain and vector constructs, may be made based on the teachings herein and in the art. Other in vitro assays, such as DNA footprinting in the presence and absence of a β-glucuronide inducer, may also be used to assay repressor activity.

[0094] Additional in vitro assays and methods for measuring the binding of the repressor to DNA and for measuring the binding of a glucuronide to the repressor involve biosensors or chip-based technologies. With biosensors, such as the BIAcore (Pharmacia Biosensor AB, Uppsala, Sweden) or the apparatus disclosed in U.S. Pat. No. 5,395,587, protein-ligand and protein-DNA interactions are measured and functionally characterized in real time using surface plasmon resonance detectors. (See, generally, Malmqvist, Nature 361:186, 1993; Coulet and Bardeletti, Biochem. Soc. Trans. 19:1, 1991; Robinson, Biochem. Soc. Trans. 19:___, 1991; and Downs, Biochem. Soc. Trans. 19: ___, 1991). Chip-based technology such as described in U.S. Pat. No. 5,412,087; WO 95/22058, U.S. application Ser. No. 08/28454, and WO 88/08875, may also be exploited for measuring binding.

[0095] As described herein, this invention provides repressor proteins that comprise the DNA-binding activity of a glucuronide repressor protein. The DNA-binding activity is the specific binding to a glucuronide operator sequence. Although a variety of in vivo and in vitro assays may be used to assess DNA binding, a genetic assay or a biosensor-based assay may be used. Briefly, in a genetic assay, the nucleotide sequence encoding a candidate binding protein is cloned into an expression vector. A strain is isolated or constructed that lacks the gusR gene or activity and contains a glucuronide operator sequence linked to a reporter gene, such that there is constitutive expression of the reporter gene. Preferably, a construct, such as pKW222 containing the operator and gusABC genes, is used, but other suitable and readily assayable reporter genes (e.g., β-galactosidase, luciferase) may be substituted for gusA. If the candidate binding protein binds to the operator, transcription and therefore enzymatic activity of gusA will be greatly diminished or eliminated. Alternatively, a mobility shift assay may be performed. Briefly, fragments of DNA containing a glucuronide operator sequence are obtained. Any suitable method for isolating these fragments may be used. For example, DNA fragments may be isolated after restriction digestion of a plasmid or other DNA that contains the operator sequence or by amplification of the operator region and purification of the amplified product. The fragments are radiolabeled and mixed with protein (see, Ausubel et al., supra, Chapter 12 for protocols). Reactions are electrophoresed through agarose or polyacrylamide gels and exposed to X-ray film. Specific protein-DNA interactions result in retarded mobility of the DNA fragment. Although less preferable, other methods may be used for detecting sequence-specific binding of proteins to DNA, including nitrocellulose filter binding, DNase I footprinting, methylation protection, and methylation interference.

[0096] In other aspects of this invention, proteins are provided that have the β-glucuronide binding activity of the glucuronide repressor. Such activity may be assayed in vitro or in vivo. For example, an in vitro assay may be performed by spotting the protein on nitrocellulose, or electrophoresing the protein and transferring it to nitrocellulose, and incubating the nitrocellulose with radiolabeled, fluorescent or chromogenic glucuronide. Any means of contacting the protein and β-glucuronide may be used. Furthermore, many β-glucuronide substrates are available that give a fluorescent or chromogenic signal upon binding or with subsequent cleavage by the addition of GUS. Bound glucuronide is then detected by autoradiography. Other in vitro assays include the biosensor-based assays described above. A suitable in vivo assay is performed by constructing a strain as described above, which contains the glucuronide operator and gusABC genes. Alternatively, another operator and reporter gene construct can be used as long as the cell can import the glucuronide. A vector construct is prepared that is capable of expressing a repressor protein having an operator-binding amino acid sequence fused to the candidate glucuronide-binding amino acid sequence. The test cell transfected with this construct will be repressed for expression of the reporter gene. A glucuronide is provided to the cell and causes derepression of the reporter gene if the repressor binds the glucuronide. By supplying different glucuronides in these assays, a pattern of discrimination for glucuronide binding is determined.

[0097] B. Uses of the Repressor to Control Gene Expression in Cells

[0098] As discussed above, this invention provides vectors for the expression of transgenes under control of a glucuronide repressor. Within the context of this invention, a transgene is any gene sequence introduced into plant or animal cells. Two types of glucuronide repressor controlled systems are provided herein. One is a glucuronide-dependent expression system; the other is a glucuronide repressed expression system (FIG. 4).

[0099] In the glucuronide-dependent system, a vector is constructed containing two expression units. One unit comprises a glucuronide repressor, preferably gusR, under control of a promoter capable of expression in the host cell. The second unit comprises the transgene under control of a promoter, with glucuronide operator sites located between the promoter and the transgene. In a resting state (without glucuronide inducer), the repressor is expressed, binds to the operator site(s) and interferes with transcription of the transgene. In the induced state, the glucuronide inducer binds to the repressor and causes release of the repressor from the operator site, thus allowing expression of the transgene (FIG. 4).

[0100] In the glucuronide-repressed expression system, two expression units are again provided. One unit comprises a fusion glucuronide repressor that has a glucuronide operator binding domain, glucuronide binding domain, and a transcriptional activator domain. The other unit comprises the transgene downstream of glucuronide operator sites. In the resting state, the fusion repressor binds to the operator and activates transcription. In the induced state, the fusion repressor binds to the glucuronide inducer and is released from the operator. Without a linked promoter, the transgene is not expressed (FIG. 4).
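The opposite responses of the two systems to the inducer can be summarized as a simple on/off model. The sketch below is only a logical abstraction of the behavior described above, with hypothetical function names; it ignores leakiness, inducer concentration, and kinetics.

```python
def dependent_system_expresses(glucuronide_present):
    """Glucuronide-dependent system: the repressor blocks transcription until induced."""
    repressor_on_operator = not glucuronide_present   # inducer releases the repressor
    return not repressor_on_operator                  # transgene is on only when released


def repressed_system_expresses(glucuronide_present):
    """Glucuronide-repressed system: the fusion repressor activates until induced."""
    activator_on_operator = not glucuronide_present   # inducer releases the fusion protein
    return activator_on_operator                      # no bound activator, no expression


# Truth table for both systems: (inducer present, dependent, repressed)
truth_table = [
    (present,
     dependent_system_expresses(present),
     repressed_system_expresses(present))
    for present in (False, True)
]
```

In both cases the inducer displaces the protein from the operator; the systems differ only in whether the bound protein blocks or activates transcription.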

[0101] For each of these systems, one skilled in the art recognizes that additional elements, such as polyadenylation signals, splice sites, enhancers, and the like, may be necessary or optimal for expression of the repressor and transgene in the host cell. As well, the choice of a promoter for the repressor and for the transgene in the glucuronide-dependent system depends in part upon the host and tissue used for expression. For example, a tissue-specific promoter may be desirable to further control expression. Furthermore, the expression units may be provided in a single vector or in multiple vectors. As well, at least one operator sequence is provided, and preferably multiple operator sites in tandem array are used. Most preferably, from 1 to 10 operator sites are included.

[0102] Transcriptional activators are well known (see, Sauer and Pabo, supra). Certain activators, such as GAL4 and GCN4, have been successfully used in two-hybrid systems to activate gene expression and their activation domains are well characterized.

[0103] As described herein, in addition to β-glucuronides, β-glucuronide derivatives that are bound by a glucuronide repressor, but are not cleaved by β-glucuronidase, or that more readily pass a cell membrane are useful in these systems. Derivatives of glucuronides that are modified at the C6 position as an ester linkage, amide linkage, or the like, to be more hydrophobic provide a glucuronide that is more membrane permeant, but still binds to the repressor protein. Derivatives of glucuronides that are altered at the C1 position (e.g., through an —N—, —C—, or —S— linkage rather than an —O— linkage) are in general not susceptible to cleavage by β-glucuronidase. One exception is that an —N— linkage is cleavable by E. coli β-glucuronidase, but is not cleavable by human β-glucuronidase. As shown herein, phenyl-thio-β-D-glucuronide is bound by a glucuronide repressor, but is not cleaved by β-glucuronidase. These types of derivatives are preferred in situations where the host cells express endogenous GUS activity. More preferred β-glucuronide derivatives are doubly modified so that they are more membrane permeable (i.e., more hydrophobic) and bind the glucuronide repressor but are not cleaved by endogenous β-glucuronidase. One example of this class of derivatives has a methyl ester at the C6 position and a thioether linkage at C1 to the aglycone. Other hydrophobic groups (e.g., ethyl ester; propyl ester) and other ether linkages (e.g., —C—; —N—) may be interchanged. Suitable hydrophobic groups and ether linkages are well known.

[0104] Transgenes for Expression

[0105] Preferred transgenes for introduction into plants encode proteins that affect fertility, including male sterility, female fecundity, and apomixis; plant protection genes, including proteins that confer resistance to diseases, bacteria, fungi, nematodes, viruses and insects; genes and proteins that affect developmental processes or confer new phenotypes, such as genes that control development of meristem, timing of flowering, and the like.

[0106] Insect and disease resistance genes are well known. Some of these genes are present in the genome of plants and have been genetically identified. Others of these genes have been found in bacteria and are used to confer resistance.

[0107] Particularly well known insect resistance genes are the crystal genes of Bacillus thuringiensis. The crystal genes are active against various insects, such as lepidopterans, Diptera, and mosquitos. Many of these genes have been cloned. For examples, see, GenBank Accession Nos. X96682, X96684; M76442, M90843, M89794, M22472, M37207, D17518, L32019, M97880, L32020, M64478, M11250, M13201, D00117, M73319, X17123, X86902, X06711, X13535, X54939, X54159, X13233, X54160, X56144, X58534, X59797, X75019, X62821, Z46442, U07642, U35780, U43605, U43606, U10985; U.S. Pat. No. 5,317,096, U.S. Pat. No. 5,254,799; U.S. Pat. No. 5,460,963, U.S. Pat. No. 5,308,760, U.S. Pat. No. 5,466,597, U.S. Pat. No. 5,2187,091, U.S. Pat. No. 5,382,429, U.S. Pat. No. 5,164,180, U.S. Pat. No. 5,206,166, U.S. Pat. No. 5,407,825, U.S. Pat. No. 4,918,066; PCT Applications WO 95/30753, WO 94/24264; AU 9062083; EP 408403 B1, EP 142924 B1, EP 256,553 B1, EP 192,741 B1; JP 62-56932. Gene sequences for these and related proteins may be obtained by standard and routine technologies, such as probe hybridization of a B. thuringiensis library or amplification (see generally, Sambrook et al., supra, Ausubel et al. supra). The probes and primers may be synthesized based on publicly available sequence information.

[0108] Other resistance genes to Sclerotinia, cyst nematodes, tobacco mosaic virus, flax and crown rust, rice blast, powdery mildew, verticillium wilt, potato beetle, aphids, as well as other infections, are useful within the context of this invention. Examples of such disease resistance genes may be isolated from teachings in the following references: isolation of rust disease resistance gene from flax plants (WO 95/29238); isolation of the gene encoding Rps2 protein from Arabidopsis thaliana that confers disease resistance to pathogens carrying the avrRpt2 avirulence gene (WO 95/28478); isolation of a gene encoding a lectin-like protein of kidney bean that confers insect resistance (JP 71-32092); isolation of the Hm1 disease resistance gene to C. carbonum from maize (WO 95/07989); for examples of other resistance genes, see WO 95/05743; U.S. Pat. No. 5,496,732; U.S. Pat. No. 5,349,126; EP 616035; EP 392225; WO 94/18335; JP 43-20631; EP 502719; WO 90/11770; U.S. Pat. No. 5,270,200; U.S. Pat. Nos. 5,218,104 and 5,306,863. In addition, general methods for identification and isolation of plant disease resistance genes are disclosed (WO 95/28423). Any of these gene sequences suitable for insertion in a vector according to the present invention may be obtained by standard recombinant technology techniques, such as probe hybridization or amplification. When amplification is performed, restriction sites suitable for cloning are preferably inserted.

[0109] Nucleotide sequences for other transgenes, such as those controlling male fertility, are found in U.S. Pat. No. 5,478,369, references therein, and Mariani et al., Nature 347:737, 1990.

[0110] Vectors, Host Cells, and Methods for Transformation

[0111] As noted above, the present invention provides vectors capable of expressing transgenes under the control of a glucuronide repressor. In agricultural applications, the vectors should be functional in plant cells. At times, it may be preferable to have vectors that are functional in E. coli (e.g., production of protein) or animal cells. Vectors and procedures for cloning and expression in E. coli and animal cells are discussed above and, for example, in Sambrook et al. (supra) and in Ausubel et al. (supra).

[0112] Vectors that are functional in plants are preferably binary plasmids derived from Agrobacterium plasmids. Such vectors are capable of transforming plant cells. These vectors contain left and right border sequences that are required for integration into the host (plant) chromosome. At minimum, between these border sequences is the gene to be expressed under control of a promoter. In preferred embodiments, a selectable marker and a reporter gene are also included. The vector also preferably contains a bacterial origin of replication.

[0113] As discussed above, this invention provides the expression of a transgene in plants or animals under control of a glucuronide repressor. The choice of the transgene depends in part upon the desired result. For example, when plant resistance is desired, a preferred gene is specific to the disease or insect.

[0114] In certain preferred embodiments, the vector contains a reporter gene. The reporter gene should allow ready determination of transformation and expression. The GUS (β-glucuronidase) gene is preferred (U.S. Pat. No. 5,268,463). Other reporter genes, such as β-galactosidase, luciferase, GFP, and the like, are also suitable in the context of this invention. Methods and substrates for assaying expression of each of these genes are well known in the art. The reporter gene should be under control of a promoter that is functional in host cells, such as the CaMV 35S promoter in plants.

[0115] The vector should contain a promoter sequence for the glucuronide repressor gene and in certain embodiments for the transgene as well. Preferably, for expression of a transgene in plants, the promoter is the CaMV 35S promoter.

[0116] Preferably, the vector contains a selectable marker for identifying transformants. The selectable marker preferably confers a growth advantage under appropriate conditions. Generally, selectable markers are drug resistance genes, such as neomycin phosphotransferase. Other drug resistance genes are known to those in the art and may be readily substituted. The selectable marker also preferably has a linked constitutive or inducible promoter and a termination sequence, including a polyadenylation signal sequence.

[0117] Additionally, a bacterial origin of replication and a selectable marker for bacteria are preferably included in the vector. Of the various origins (e.g., colEI, fd phage), a colEI origin of replication is preferred. Most preferred is the origin from the pUC plasmids, which allows high copy number.

[0118] A general vector suitable for use in the present invention is based on pBI121 (U.S. Pat. No. 5,432,081), a derivative of pBIN19. Other vectors have been described (U.S. Pat. No. 4,536,475) or may be constructed based on the guidelines presented herein. The plasmid pBI121 contains a left and right border sequence for integration into a plant host chromosome. These border sequences flank two genes. One is a kanamycin resistance gene (neomycin phosphotransferase) driven by a nopaline synthase promoter and using a nopaline synthase polyadenylation site. The second is the E. coli GUS gene (reporter gene) under control of the CaMV 35S promoter and polyadenylated using a nopaline synthase polyadenylation site. Either one of the expression units described above is additionally inserted or is inserted in place of the CaMV promoter and GUS gene. Plasmid pBI121 also contains a bacterial origin of replication and selectable marker.

[0119] Vectors suitable for expression in animal cells are well known in the art and are generally described in Ausubel et al., supra and Sambrook et al., supra. In addition, transformation methods are well known and include electroporation, direct injection, CaPO₄-mediated transfection and the like.

[0120] Plant Transformation Methods

[0121] Plants may be transformed by any of several methods. For example, plasmid DNA may be introduced by Agrobacterium co-cultivation or bombardment. Other transformation methods include electroporation, CaPO₄-mediated transfection, and the like. Preferably, vector DNA is first transfected into Agrobacterium and subsequently introduced into plant cells. Most preferably, the infection is achieved by co-cultivation. In part, the choice of transformation methods depends upon the plant to be transformed. For example, monocots generally cannot be transformed by Agrobacterium. Thus, Agrobacterium transformation by co-cultivation is most appropriate for dicots and for mitotically active tissue. Non-mitotic dicot tissues can be efficiently infected by Agrobacterium when a projectile or bombardment method is utilized. Projectile methods are also generally used for transforming sunflowers and soybean. Bombardment is used when naked DNA, typically Agrobacterium or pUC-based plasmids, is used for transformation or transient expression.

[0122] Briefly, co-cultivation is performed by first transforming Agrobacterium by freeze-thawing (Holsters et al., Mol. Gen. Genet. 163:181-187, 1978) or by other suitable methods (see, Ausubel, et al. supra; Sambrook et al., supra). A culture of Agrobacterium containing the plasmid is incubated with leaf disks, protoplasts or meristematic tissue to generate transformed plants (Bevan, Nucl. Acids. Res. 12:8711, 1984).

[0123] Briefly, for microprojectile bombardment, seeds are surface sterilized in bleach solution and rinsed with distilled water. Seeds are then imbibed in distilled water, and the cotyledons are broken off to produce a clean fracture at the plane of the embryonic axis. Explants are then bisected longitudinally between the primordial leaves and placed cut surface up on medium with growth regulating hormones, minerals and vitamin additives. Explants are bombarded with 1.8 μm tungsten microprojectiles by a particle acceleration device. Freshly bombarded explants are placed in a suspension of transformed Agrobacterium, then transferred to medium with the cut surfaces down for 3 days with an 18 hr light cycle. Explants are transferred to medium lacking growth regulators but containing drug for selection and grown for 2-5 weeks. After 1-2 weeks more without drug selection, leaf samples from green, drug-resistant shoots are grafted to in vitro grown rootstock and transferred to soil.

[0124] Glucuronide inducer is applied to the plants when a change in the state of expression of the transgene is desired. Any glucuronide that is transported into a cell is useful in the context of this invention. The vasculature system of the plant distributes the inducer. The inducer enters cells either by passive diffusion or by the expression of a permease, which is also a transgene. Preferably, the glucuronide is not degraded by the host cell. Also, preferably, the glucuronide is soluble in aqueous solutions. The glucuronide may be applied by spraying the plant or soil, by providing it in fertilizer, and the like.

[0125] C. Use of the Repressor in Diagnostics

[0126] As simple glycosides, β-glucuronides are extremely important as the most prominent of the two principal forms in which xenobiotics (compounds that are foreign to the body) and endogenous phenols and aliphatic alcohols are rendered biologically inert (detoxified) and excreted in the urine and bile of vertebrates (reviewed by Dutton, 1966, 1981).

[0127] The principal problem underlying detoxification in vertebrates is that many compounds within the body, including endogenous biologically active molecules such as steroid hormones, bio-degradation products such as bilirubin, and foreign compounds (xenobiotics) that may have been introduced into the body in food or medicine, are lipophilic or fat soluble. Hence, they do not dissolve readily in urine or bile, the two major routes to removal of waste products from the body. This problem is overcome by conjugation of the lipophilic compounds to highly polar residues, such as glucuronic acid or a sulfate residue, making the resulting conjugate highly water soluble, and thus able to be excreted from the body.

[0128] Glucuronidation occurs in many tissues in vertebrates, particularly in the liver. The reaction is carried out by a set of membrane-bound enzymes that catalyze the transfer of a glucuronate residue from uridine diphosphate 1-α-D-glucuronate to the aglycon (the aglycon is the residue being detoxified, to which the sugar molecule or glycon is bound). Several isozymes of UDP-glucuronyl transferase have been characterized, and these are reviewed in detail in Dutton (1980). These enzymes frequently form part of a collection of detoxifying enzymes, including hydroxylases and mixed-function oxidases, that work together to metabolize lipophilic, relatively insoluble compounds into the highly water-soluble glucuronide conjugates (as well as into sulfates and other derivatives). These conjugates are then excreted into the bile (for the larger glucuronide conjugates) or the urine. (See FIG. 5.)

[0129] Several thousand β-glucuronides have been identified in urine and bile as detoxication products. This includes many that form following oral administration of the free aglycon or a related compound, for example, as a drug during medical treatment, and an extensive list of known glucuronides can be found in Dutton (Glucuronic Acid, Free and Combined, Academic Press, New York 1966). In addition, many endogenous steroid hormones and bioactive substances, or bio-degradation products such as bilirubin, are conjugated and excreted as β-glucuronide conjugates. This process of conjugation with glucuronides is reversed by activity of the enzyme β-glucuronidase (GUS).

[0130] The ability of GUS to cleave a β-glucuronide conjugate depends upon two key steps: (1) the substrate must be taken up into the cell, generally mediated via the glucuronide permease, and (2) the substrate must be able to alleviate repression by the gus repressor.

[0131] The ability of different glucuronides to induce GUS activity varies (e.g., methyl β-D-glucuronide at 1 mM concentration induces a level of GUS activity approximately 15 times that of phenyl β-D-thioglucuronide). In addition, 5-bromo-4-chloro-3-indolyl β-D-glucuronide (X-Gluc), p-nitrophenyl β-D-glucuronide (PNPG), 4-methylumbelliferyl β-D-glucuronide (MUG) and resorufin glucuronide all act as powerful inducers. In general, values of GUS activity measured after 90 minutes of induction, starting with 1 mM external concentrations of these glucuronides, are of the order of 1-50 nmols PNPG hydrolyzed per minute per OD₆₀₀ unit of bacterial culture. Glucuronides that occur naturally in the body, including oestrogen glucuronide and testosterone glucuronide, also have inducing ability (see Example 4 below).

[0132] The ability of the glucuronides to induce GUS, and therefore bind the repressor, may be used to assay the presence of glucuronides in a sample. Typically, for mammals, and humans in particular, the sample is preferably urine, but may also be bile obtained from the bile duct or large intestine, or sera. An assay for detecting glucuronides is as follows. Briefly, an operator sequence is bound with a glucuronide repressor. The sample is added, and if a glucuronide that binds to the repressor is present, the repressor is released from the operator. The unbound repressor is then detected. A glucuronide is present in a sample if the release of the repressor is higher than the release detected when a sample that does not contain the glucuronide is used.
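The comparison that closes the assay described above can be sketched in code. The following Python fragment is an illustration only: the function name `glucuronide_detected`, the `margin` parameter, and the use of arbitrary signal units are assumptions introduced here and are not part of the described method.

```python
def glucuronide_detected(sample_release, control_release, margin=0.0):
    """Decide whether a sample contains a repressor-binding glucuronide.

    sample_release  -- signal from unbound repressor after adding the sample
    control_release -- signal obtained with a sample known to lack the glucuronide
    margin          -- hypothetical allowance for measurement noise (an assumption)
    """
    if sample_release < 0 or control_release < 0 or margin < 0:
        raise ValueError("signals and margin must be non-negative")
    # Positive only when release exceeds that of the glucuronide-free control.
    return sample_release > control_release + margin
```

In practice the margin would presumably be set from replicate control measurements; with the default of zero the function reduces to the plain "higher than control" rule stated in the text.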

[0133] The DNA sequence may be a glucuronide operator, but may alternatively be any sequence that the repressor specifically binds. For example, if the repressor is a fusion of a lac repressor DNA binding sequence and a glucuronide binding domain, the DNA sequence is the lac operator. Furthermore, the repressor may bind only a single glucuronide. Methods for generating and assaying such repressors are described herein.

[0134] Although this assay can be performed in solution, preferably the operator is bound to a solid substrate. Such solid substrates include beads, chips, biosensors and the like. Specific detection includes any means that distinguishes unbound repressor from bound repressor. Such means include colorimetric methods, surface plasmon resonance, chemiluminescence, autoradiography and others known in the art.

[0135] D. Use of the Repressor to Purify Glucuronides

[0136] This invention provides methods to purify glucuronides using the binding characteristics of a glucuronide repressor. Briefly, a glucuronide repressor or glucuronide binding domain is attached, conjugated, or bound to a substrate. Alternatively, the repressor or domain is in solution. A sample containing a glucuronide is added for sufficient time to bind to the repressor. Preferably, the sample is added for a time to achieve equilibrium binding. Unbound material is washed away, and bound glucuronide is eluted. In general, elution occurs under non-physiological conditions, such as temperature shift, increased or decreased salt concentration, or increased or decreased pH. (See, for example, Dean et al., Affinity Chromatography: A Practical Approach, IRL Press, Oxford, England, 1985.)

[0137] The repressor may be bound to a variety of matrices. Proteins are readily attached to agarose beads, dextran beads, nitrocellulose, polyacrylamide beads, magnetic beads, and the like. Methods for coupling to these and similar solid substrates are well known and a general discussion is found in Dean et al. (supra). In preferred embodiments, the repressor is isolated as a hexahis fusion protein, which is readily bound to a nickel column. Other fusion protein tags, such as S tag, T7 tag, HSV tag, are readily available (Novagen, Madison, Wis.), as well as kits containing the materials for binding the fusion protein. The repressor may alternatively be conjugated with biotin and bound to an avidin or streptavidin-conjugated substrate (e.g., streptavidin-agarose beads) either before or after contact with the sample.

[0138] When isolation of a specific glucuronide is desired, the glucuronide repressor used for isolation preferably binds that glucuronide specifically and either does not bind other glucuronides or binds others with a much lower affinity. A specific binding glucuronide repressor is either naturally found or is a variant generated by the methods described herein.

[0139] E. Use of the Repressor to Identify a Glucuronide Transport Protein from a Vertebrate

[0140] This invention also provides methods for identifying a glucuronide transport protein from a vertebrate. As discussed above, GUS activity is found in essentially all vertebrates, implying that a specific transport protein is present. However, identification and isolation of such a protein has remained elusive. Clones expressing a glucuronide repressor are used to facilitate identification of a clone expressing a vertebrate transport protein.

[0141] Briefly, a cell that does not have GUS activity is transformed with a vector expressing gusR and a reporter or selectable gene linked to glucuronide operator sequences. In a resting state, the reporter gene is not expressed. When a glucuronide is added, there should be no expression of the reporter gene, indicating that the cell lacks a glucuronide transport protein. Suitable host cells include yeast and plants, and most bacteria. Transformed cells are then transfected with an expression library from a vertebrate, such as a human expression library. Such libraries are commercially available or are constructed by standard methodologies. Doubly transformed cells are treated with β-glucuronides and expression of the reporter or selectable gene is assayed. A selectable gene is preferred and examples of such genes include drug resistance genes (e.g., G418 resistance). Cells that transport the glucuronide express the reporter gene, and the clone responsible for transport is isolated and characterized.
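The logic of this screen can be summarized as a small truth-table model. The Python sketch below is a simplification offered for illustration: the function names are hypothetical, and the model assumes that the glucuronide cannot enter the host cell by passive diffusion, so that uptake depends entirely on a transport protein.

```python
def reporter_expressed(has_repressor, glucuronide_added, has_transporter):
    """Predict expression of a reporter linked to glucuronide operator sequences."""
    # Without GusR the operator-linked gene is not repressed at all.
    if not has_repressor:
        return True
    # With GusR present, induction needs the glucuronide inside the cell.
    return glucuronide_added and has_transporter


def is_transporter_candidate(induced_before_library, induced_after_library):
    """A clone is of interest when glucuronide fails to induce the reporter in
    the singly transformed host but does induce it once a library clone is present."""
    return (not induced_before_library) and induced_after_library


# Host carrying gusR and the reporter, treated with glucuronide, no transporter.
host = reporter_expressed(True, True, False)
# Same host after receiving a library clone that supplies a transporter.
clone = reporter_expressed(True, True, True)
candidate = is_transporter_candidate(host, clone)
```

Under these assumptions only cells that received a transport-competent clone switch the reporter on, which is why a selectable gene makes the screen convenient.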

[0142] The following examples are offered by way of illustration, and not by way of limitation.

EXAMPLES

Example 1 Cloning of the E. coli Glucuronide Repressor (gusR)

[0143] A chromosomal region of E. coli known to encode gusA, which encodes β-glucuronidase (see U.S. Pat. No. 5,268,463), and gusB, which encodes glucuronide permease (see U.S. Pat. Nos. 5,268,463 and 5,432,081), is cloned as a Pst I-Hind III fragment from digested E. coli genomic DNA. The fragment is inserted into either a low-copy plasmid vector, pRK404 (pKW212), or a high-copy plasmid vector, pBSII SK+ (pKW214). When a clone containing only the gusA and gusB genes is transfected into a host cell, high levels of constitutive GUS activity are measured in extracts using the substrate p-nitrophenyl-glucuronide. In contrast, a host cell transfected with either clone containing the Pst I-Hind III fragment, which extends several kilobases in the 5′ and 3′ direction of gusA and gusB, does not have glucuronidase activity. However, glucuronidase activity is induced by addition of a GUS substrate, such as p-nitrophenyl-glucuronide. Thus, the Pst I-Hind III fragment contains a gene capable of repressing the transcription of gusA and gusB, and the repression is relieved by the addition of a substrate glucuronide molecule.

[0144] Identification of the repressor gene was facilitated by the construction of two subclones of the Pst I-Hind III fragment of pKW212. One subclone contained an EcoR I-Hind III fragment known to comprise the gus promoter and the gusABC genes (pKW222). A second subclone contained an approximately 1.4 kb BstX I-Nco I fragment (nucleotides 1 to 1368 of SEQ ID NO: 4), which maps downstream of the Pst I site and upstream of gusA. The fragment was cloned as a blunt-ended fragment into pBSIISK+ to create pKW223 (FIG. 3). The repressor is shown to reside on the 1.4 kb BstX I-Nco I fragment by the following transformation experiment. Strain KW1, which is deleted for the entire gus operon region, is transformed with pKW222. This transformant shows a high level of constitutive GUS activity. When this transformed strain is further transformed with the compatible plasmid pKW223, virtually all GUS activity is eliminated, indicating that pKW223 comprises a gene or DNA sequence that represses the expression of the gus operon. This repression is reversible by addition of the inducer molecule X-glcA (5-bromo-4-chloro-3-indolyl-β-D-glucuronide). This is demonstrated by the production of deep blue colonies when the doubly transformed cells are plated on the indigogenic substrate X-glcA.

[0145] The DNA sequence of the GUS gene region was determined from the inserts of pKW222 and pKW223 and is presented in SEQ ID NO: 1. The gusABC genes were identified as beginning at nucleotide 1466. Two large open reading frames 5′ of gusA were found from nucleotides 1-264 and 485-1075. The 5′-most reading frame was identified as 7-alpha-hydroxysteroid dehydrogenase. The predicted amino acid sequence of the second open reading frame showed significant sequence similarity to other bacterial transcriptional repressors, thus providing evidence that this open reading frame codes for gusR. The predicted repressor protein is approximately 196 amino acids; the precise translational start codon is uncertain because there are three methionine residues at the N-terminal portion of the predicted protein (SEQ ID NO: 2). The repressor protein appears to have three domains: a DNA binding domain of approximately 60 amino acids; a glucuronide binding domain of from about 100 to 140 amino acids; and a domain of about 40 amino acids that has a leucine zipper, similar to that of other transcription factors, that may mediate dimerization. The precise boundaries of these domains, and whether there are two or three separable domains, are not definitively established.
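The stated size of the predicted repressor can be checked against the coordinates of the second open reading frame. The Python sketch below is a worked calculation under two assumptions that are not stated in the text: that the range 485-1075 includes the stop codon, and that an average residue mass of about 110 Da (a common rule of thumb) is adequate for a rough mass estimate. The function name is hypothetical.

```python
def orf_summary(start, end, avg_residue_da=110.0):
    """Return (length in nt, residue count, approximate mass in kDa) for an ORF.

    start, end      -- 1-based inclusive nucleotide coordinates
    avg_residue_da  -- assumed average mass per amino acid residue
    """
    length_nt = end - start + 1
    if length_nt % 3 != 0:
        raise ValueError("ORF length is not a whole number of codons")
    codons = length_nt // 3
    residues = codons - 1  # assumption: the final codon is the stop codon
    mass_kda = residues * avg_residue_da / 1000.0
    return length_nt, residues, mass_kda


length_nt, residues, mass_kda = orf_summary(485, 1075)
print(length_nt, residues, round(mass_kda, 1))
```

On these assumptions the open reading frame spans 591 nucleotides and 196 residues, in line with the "approximately 196 amino acids" given above, and the rough mass estimate is close to the 22 kDa GusR protein described in Example 3.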

Example 2 Identification of the E. coli Glucuronide Operator

[0146] Two approaches lead to identification of the operator sequence of the gus operon. In one approach, subclones of the operator region are constructed and tested for ability to titrate repressor away from operator sites on chromosomal DNA. In the second approach, particular sequences of interest within the operator region are synthesized, cloned into a high copy plasmid, and tested by repressor/operator titration experiments. (See FIGS. 8 and 9.)

[0147] (1) A 1.4 kb BamHI-BamHI fragment containing the entire intergenic region between gusA (the first gene of the gus operon) and the upstream gene gusR was isolated and cloned into the vector pBSII(SK+) to create pKW244 (FIGS. 6 and 7). The BamHI fragment encompasses the main operator sites regulating the gus operon. Initial experiments confirmed that the insert of pKW244 does contain repressor binding sites. E. coli strain DH5α transformed with pKW244 yields blue colonies on plates containing X-gluc, indicating induction of the gus operon by repressor titration.

[0148] Subclones of the regulatory region were constructed (FIGS. 10-15). The β-glucuronidase activity of these clones is presented in the following Table and FIG. 15.

Plasmid     Average amount of             95%          % of pKW244
            β-glucuronidase production    confidence   β-glucuronidase
            (nmol pNP/min/mg protein)     limit        production
pKW244      943                           154          100
pBSIISK+    1.26                          0.6          0.1
pMEL1       22.04                         9.1          2.3
pMEL3       926.5                         486          98.2
pMEL4       198.14                        35.6         21
pMEL5       254.9                         31.2         27
pMEL8       1.16                          0.4          0.1

[0149] These results show that pMEL3, pMEL4, and pMEL5 contain operator sequences and thus, the operator region was narrowed.
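The last column of the Table in this Example is the average activity of each plasmid expressed relative to pKW244. The Python sketch below reproduces that normalization from the tabulated averages; the function name `percent_of_reference` is introduced here for illustration, and small differences from the tabulated percentages are to be expected because the published averages are themselves rounded.

```python
# Average β-glucuronidase production (nmol pNP/min/mg protein) in DH5α,
# copied from the Table in this Example.
ACTIVITY = {
    "pKW244": 943,
    "pBSIISK+": 1.26,
    "pMEL1": 22.04,
    "pMEL3": 926.5,
    "pMEL4": 198.14,
    "pMEL5": 254.9,
    "pMEL8": 1.16,
}


def percent_of_reference(activity, reference="pKW244"):
    """Express each activity as a percentage of the reference plasmid."""
    ref = activity[reference]
    return {name: 100.0 * value / ref for name, value in activity.items()}


for name, pct in percent_of_reference(ACTIVITY).items():
    print(f"{name:10s} {pct:6.1f} %")
```

Read this way, a plasmid that carries operator sequences titrates repressor and scores well above the vector-only control (pBSIISK+), which is the basis for the conclusion drawn about pMEL3, pMEL4, and pMEL5.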

[0150] A second approach that identifies operator sites of the gus operon is performed by synthesizing and cloning putative operator sequences directly into a pBSIISK+ vector and testing the clones for repressor binding by titration (FIG. 8). Three putative operator sequences, consisting of palindromic sequences, were identified from DNA sequence analysis.

[0151] One potential operator sequence is a 14 bp imperfect palindrome centered around an Hpa I site at +15 from the gus operon putative transcriptional start. A second, highly homologous (13 out of 14 base pairs) Hpa I palindrome is also present near the transcriptional start of the gusR gene. As the majority of repressors, including gusR, are known to regulate themselves, it was expected that a GusR operator site also exists.
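How closely a candidate site approaches a perfect palindrome can be scored by comparing the sequence with its own reverse complement. The Python sketch below illustrates the idea; the function names are hypothetical, and the 14-mer used in the example is an invented sequence built around the Hpa I recognition site (GTTAAC), not the operator sequence shown in the figures.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def palindrome_matches(seq):
    """Count positions at which a sequence equals its reverse complement.

    A perfect palindrome scores len(seq); an imperfect one scores lower.
    """
    seq = seq.upper()
    return sum(a == b for a, b in zip(seq, reverse_complement(seq)))


hpa1_site = "GTTAAC"               # Hpa I recognition site, a perfect palindrome
illustrative = "ACTGGTTAACCAGA"    # invented 14-mer, for illustration only
print(palindrome_matches(hpa1_site), len(hpa1_site))
print(palindrome_matches(illustrative), len(illustrative))
```

Because a mismatch on one half of the site is mirrored on the other half, scores for an even-length site fall in steps of two; comparing two different sites with each other, as in the "13 out of 14" figure above, would instead use a direct position-by-position identity count.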

[0152] Both HpaI-centered sequences were cloned into pBSIISK+ (FIGS. 16 and 17). Two complementary oligonucleotides were synthesized and annealed. The double-stranded oligonucleotides had EcoRI and BamHI sticky ends, which were cloned into pBSIISK+ vector which had been prepared by digestion with EcoRI and BamHI. Clones containing these inserts were identified by titration of GUS activity in DH5α transformants plated on X-gluc plates and by the incorporation of the HpaI site in the resulting plasmid.

[0153] Operator/repressor titration experiments performed on the various gus operon subclones discussed above suggested that a second region of DNA, separate from the HpaI palindrome discussed above, binds repressor molecules. This 75 bp region contains a 40 bp sequence containing two overlapping palindromes (FIG. 18). A clone containing this region resulted in approximately 20% induction of the gus operon, indicating that it was sufficient to account for all repressor binding observed with pMEL4 and pMEL5 transformed DH5α. This further narrows down the positioning of a repressor binding sequence upstream of the transcriptional start to this particular fragment of DNA. Further analysis using a strain deleted for the uxu operon (ER1648; New England Biolabs, Beverly, Mass.) demonstrated that the uxu repressor accounts for less than 5% of gus operon repression.

[0154] This palindromic region was cloned into the pBSIISK− vector using complementary oligonucleotides that, when annealed, create EcoRI and BamHI sticky ends. Clones (pMEL6) were screened for by the titration of GUS activity in DH5α transformants plated on X-gluc plates. Candidate clones were verified by restriction digestion with Psp1406I. In addition, a perfect palindrome centered around the Psp1406I site was cloned into pBSIISK+ (pMEL7) to test for stronger repressor binding. Due to the nature of a perfect palindrome, only one oligonucleotide was synthesized, which created BamHI overhangs (see FIG. 18). Resultant clones were selected for by the loss of the α-complementation phenotype of the pBSIISK+ vector in DH5α transformants plated on Magenta-Gal (100 μg/ml) and verified by digestion with Psp1406I. This clone, pMEL7, resulted in very little titration when transformed into DH5α. The loss of repressor binding ability would seem to indicate that the true operator site within this region is the second palindrome, centered at −164 from the gus operon transcriptional start. However, in creating this 18 bp perfect palindrome, it is possible that nucleotides important to repressor binding to this region may have been replaced, thereby reducing the overall affinity of this site for a repressor molecule.

[0155] Identification of ER1648 as an uxuR deletion strain allowed operator/repressor titration experiments performed with the gus operon regulatory sub-regions to be performed in a strain lacking the UxuR repressor. Any significant differences observed between these two systems could then be attributed to the absence of an UxuR titration effect. A number of the various gus operon regulatory region subclones were transformed into this strain. β-glucuronidase production was measured by the spectrophotometric GUS assay. Results of these titration experiments are recorded in the Table below and shown schematically in FIG. 19.

Plasmid     Average amount of             95%          % of pKW244
            β-glucuronidase production    confidence   β-glucuronidase
            (nmol pNP/min/mg protein)     limit        production
pKW244      709                           145          100
pBSIISK+    35.2                          5.2          5.0
pMEL1       9.7                           3.5          1.4
pMEL3       583                           131          82.2
pMEL4       719                           561          >100
pMEL5       753                           219          >100
pMEL6       819                           123          >100
pMEL34      43.3                          16.3         6.1

[0156] When pMEL1 and pMEL34 were transformed into ER1648, no significant increase from the background β-glucuronidase activity was detectable, suggesting that these plasmids were not titrating repressor away from the gus operon in this strain. As these plasmids contain the HpaI palindromic sequence shown to titrate repressor when transformed into DH5α, this indicates that the HpaI palindrome is an UxuR binding site.

[0157] In contrast, pMEL4, pMEL5 and pMEL6, all containing the major region of repressor binding regulating the gus operon, showed a 5-fold increase in titration effect when transformed into this uxuR deletion strain, equaling that produced by pKW244 transformants.

[0158] Therefore, repressor/operator titration experiments performed with various sub-clones of the gus operon regulatory region have resulted in the identification of two repressor binding regions regulating the gus operon. A major binding region is located on a 44 bp fragment situated between −136 and −180 bp upstream of the gusA transcriptional start site, while a second, minor binding site is found in the HpaI centered imperfect palindrome located at +25 from this same start of transcription. This second binding site is an UxuR operator site.

Example 3 Expression of Gus Repressor Protein

[0159] Overexpression of the gusR gene product is achieved by cloning the coding region in an expression vector. The gusR gene is cloned into a variety of expression vectors by subcloning the gene from pKW223 and by amplification.

[0160] A. Expression of gusR as a lacZ Fusion Protein

[0161] The gusR gene was initially cloned in a 5′-3′ transcriptional orientation downstream of the lac promoter in pBSIISK+ (pKW224) (FIG. 20). The fragment containing gusR had an additional 490 bp of upstream and 305 bp of downstream sequence. However, no GusR protein was detected when this plasmid was introduced into E. coli, suggesting that a sequence was hampering the expression of the gusR gene from the lacZ promoter. An inspection of the upstream sequence revealed an open reading frame found to contain the C-terminal coding region and the transcriptional terminator of the hsdH gene, involved in E. coli steroid metabolism (Yoshimoto et al., 1991). These sequences likely halted mRNA elongation and translation from the lacZ promoter prior to the gusR gene, located further downstream.

[0162] The hsd terminator was subsequently removed in the following manner. pKW224 was digested with Spe I, which cuts in the polylinker and 40 bp upstream of the putative translational start of gusR, releasing a 468 bp fragment containing the hsd terminator and leaving a 3866 bp fragment containing vector sequences and the gusR gene sequence in the same orientation as the lac promoter. Following ligation, clones lacking the 468 bp fragment were identified by amplification of a 1500 bp product using the T7 and reverse sequencing primers. Candidate clones were verified as lacking the Spe I site. One isolate was named pMEL101 (FIG. 21).

[0163] pMEL101 was transformed into E. coli strain KW1 (deleted for the gus operon) and induced for expression by 0.5 mM IPTG. A protein of about 26 kDa was clearly detected in pMEL101 transformed KW1, but was not detected in protein extracts from wild-type KW1, pBSIISK−-transformed KW1, or pKW224-transformed KW1 (FIG. 22). A 26 kDa protein is the predicted mass of a fusion protein formed between the 22 kDa GusR protein and the lacZ coding sequence upstream of this gene in pMEL101.

[0164] GusR was also amplified with the primer pair:
5′-CGAGAATTCGAGGAGTCCATCATGATGGATAACATGCAGACTGAAG-3′
5′-GCTGAATTCAAGCTTCAGGATGCGGTTAAGATACCGCC-3′

[0165] The 5′ primer (upper primer) contains an EcoRI site and a strong Shine-Dalgarno sequence. The 3′ primer (lower primer) contains an EcoRI site. The amplified product was digested with EcoRI and inserted into a vector to give either a lacZ fusion or a non-fusion protein. FIG. 24 shows that the predicted 22 kDa (non-fusion) and 26 kDa (fusion) proteins were produced.
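The features attributed to this primer pair can be checked directly on the sequences. The Python sketch below locates the EcoRI recognition site (GAATTC) in each primer and confirms that the reverse complement of the lower primer contains the last 21 nucleotides of the gusR coding sequence as given in SEQ ID NO: 1. The function and constant names are introduced here for illustration only.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
ECORI = "GAATTC"  # EcoRI recognition site

UPPER_PRIMER = "CGAGAATTCGAGGAGTCCATCATGATGGATAACATGCAGACTGAAG"
LOWER_PRIMER = "GCTGAATTCAAGCTTCAGGATGCGGTTAAGATACCGCC"
GUSR_3PRIME_END = "GGCGGTATCTTAACCGCATCC"  # last 21 nt of SEQ ID NO: 1


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def site_positions(seq, site):
    """Return every 0-based start index at which `site` occurs in `seq`."""
    width = len(site)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == site]


def anneals_to(primer, target_end):
    """True when the primer is complementary to the given top-strand segment."""
    return target_end in reverse_complement(primer)


print(site_positions(UPPER_PRIMER, ECORI), site_positions(LOWER_PRIMER, ECORI))
print(anneals_to(LOWER_PRIMER, GUSR_3PRIME_END))
```

Each primer carries a single EcoRI site near its 5′ end, so digestion of the amplified product leaves the coding region intact between the two cut sites.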

[0166] B. Expression of GusR as a Non-Fusion Protein in pMEL101 Derivative

[0167] pMEL101 was engineered to create a frameshift in the fusion protein leading to the creation of two stop codons in frame with lacZ and just upstream of the gusR gene. The translational stop codons would force the detachment of ribosomes from the mRNA transcript at this site and their reattachment at the nearby gusR start of translation. As such, the expression of wild-type GusR protein would ensue. pMEL103 was constructed by digestion of pMEL101 with Sac I, a site located in the polylinker, followed by removal of the sticky ends by digestion with T4 DNA polymerase. The treated plasmid was religated, transformed into KW1, and a clone with the desired configuration was isolated (pMEL103) (FIG. 23). SDS-PAGE analysis of protein extracts of pMEL103-transformed KW1 showed the overexpression of a 22 kDa GusR protein. However, genetic tests showed that, despite the expression of GusR, the large decrease in GUS activity expected after induction with IPTG was not seen. An examination of the DNA sequence downstream of the frameshift identified a second E. coli start codon (GTG) 12 codons upstream of the gusR translational start. Ribosomal reattachment may therefore be occurring preferentially at this site, rather than at the gusR start of translation, to produce an inactive fusion protein. This is likely considering the lack of a strong Shine-Dalgarno sequence regulating the gusR gene.

[0168] C. Expression of GusR as a Hexa-His Fusion Protein

[0169] The coding region of gusR is amplified and inserted into an expression vector. The vector is a derivative of pTTQ18 (Stark, Gene 51:255, 1987) in which an NcoI site was engineered downstream of a strong Shine-Dalgarno sequence, and an NheI site adjacent to six His codons was also engineered. The primers used in the amplification reaction are as follows:
gusR-0528T 5′-GACCAGGTTACCATGGATAACATGCAGACTGAAGCAC-3′
gusR-0528B 5′-GACGTGATGGTGGCTAGCGGATGCGGTTAAGATACCGCCAATC-3′

[0170] The resulting amplified product (and the native product) uses the second methionine as a translational start and contains an Nco I site (underlined in 0528T) at the 5′ end to facilitate cloning, as well as an NheI site at the 3′ end (underlined in 0528B) such that the product is inserted in-frame with vector sequence encoding 6 His residues at the C-terminal end. The nucleotides identical or complementary to gusR are in bold. gusR is amplified from pMEL101, and inserted into a vector. Protein is produced and isolated by nickel-chromatography.
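The reading frame set by the Nco I site in primer gusR-0528T can be checked by translating the primer from the ATG embedded in that site (CCATGG). The Python sketch below builds the standard genetic code and translates the complete codons that follow; the helper names are hypothetical, and the comparison with the N-terminus of SEQ ID NO: 2 is made here only as a consistency check.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

PRIMER_0528T = "GACCAGGTTACCATGGATAACATGCAGACTGAAGCAC"
PRIMER_0528B = "GACGTGATGGTGGCTAGCGGATGCGGTTAAGATACCGCCAATC"
NCOI = "CCATGG"   # Nco I recognition site; contains the ATG start codon
NHEI = "GCTAGC"   # Nhe I recognition site


def translate_from_site(primer, site=NCOI):
    """Translate complete codons starting at the ATG inside the Nco I site."""
    start = primer.index(site) + 2          # ATG begins 2 nt into CCATGG
    coding = primer[start:]
    usable = len(coding) - len(coding) % 3  # drop any trailing partial codon
    return "".join(CODON_TABLE[coding[i:i + 3]] for i in range(0, usable, 3))


print(translate_from_site(PRIMER_0528T))
print(NHEI in PRIMER_0528B)
```

The translated residues begin Met-Asp-Asn-Met-Gln-Thr-Glu-Ala, which agrees with the start of SEQ ID NO: 2, and the Nhe I site is present in the lower primer as described.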

[0171] D. Purification of Glucuronidase Repressor Protein

[0172] Suitable bacterial hosts (e.g., E. coli JM105; XL-1 Blue) are transformed with a vector construct that is capable of expressing a glucuronidase repressor. Preferred vectors allow induction of expression upon addition of IPTG. Some suitable vectors are described above; others are well known and readily available. Following induction and a suitable growth period, the cells are harvested and lysed by agitation with glass beads. The lysate is clarified by centrifugation and batch adsorbed on a glucuronide-chromatography matrix, phenylthio-β-D-glucuronide (PTG)-Sepharose CL6B or saccharolactone-agarose for gusR, or Ni-IDA-Sepharose for the His₆-gusR fusion. The columns are either procured commercially or synthesized by linkage using carbodiimide chemistry. The matrix is poured into a column and washed with buffer, typically either 50 mM Tris pH 7.6, 1 mM DTT; 50 mM MES pH 7.0; or IMAC buffer (for hexa-his fusions). The repressor bound to the matrix is eluted in NaCl-containing buffer.

[0173] As shown in FIGS. 25, 26, and 27, purified repressor protein is readily obtainable by these methods. gusR is substantially eluted from saccharolactone-agarose in 0.1 M NaCl and also in 0.5 M NaCl (FIG. 26) and is substantially eluted from PTG-Sepharose at 0.3 M NaCl (FIG. 25). HexaHis-gusR is eluted from Ni-IDA-Sepharose in 10 mM EDTA (FIG. 27).

Example 4 Induction of GUS by β-Glucuronides in Wild-Type E. coli

[0174] Various β-glucuronides are tested for their ability to induce GUS activity. These inducers include steroid glucuronides. Wild-type E. coli is isolated from feces and grown to mid-log phase. Inducer is added at 1 mM for 60 min. The cells are washed and GUS activity determined. The following table indicates that natural β-glucuronides found in vertebrates induce the gus operon. Moreover, there is no correlation between the molecular weight of the inducer and its inducing ability.

INDUCER                                   Mol. Wt.   INDUCTION (%)
None                                      -          <0.5
phenyl glucuronide                        270        100
o-aminophenyl glucuronide                 285        95
p-nitrophenyl glucuronide                 315        68
4-methylumbelliferyl glucuronide          352        89
3-cyanoumbelliferyl glucuronide           338        84
tryptophyl glucuronide                    380        85
5-bromo-4-chloro-3-indolyl glucuronide    521        99
hydroxyquinoline glucuronide              321        21
naphthol ASBI glucuronide                 548        12
phenolphthalein glucuronide               493        13
estriol-3-glucuronide                     464        13
estriol-17-glucuronide                    464        11
estrone-17-glucuronide                    464        13
testosterone-glucuronide                  464        12
pregnanediol-glucuronide                  497        11

[0175] A biological indicator for detecting the presence and concentration of glucuronides in a sample, such as urine, blood, bile, cell extracts, and the like, can be constructed. Briefly, the gusA gene in any of the vector constructs expressing gusA under control of the glucuronidase promoter/operator region is replaced with the coding region of another reporter gene. Suitable reporter genes are well known, and their sequences, or clones containing the genes, are available. These reporter genes include β-gal, luciferase, green fluorescent protein and the like. The engineered construct, which has a synthetic operon, is introduced into a host cell, such as bacteria, plant cell, animal cell, fungal cell, or any cell line. Preferably, the host cell lacks endogenous GUS activity and expresses a glucuronide transport molecule or is able to transport the glucuronide across a cell membrane. The synthetic operon is thus induced by a glucuronide but the induced gene does not cleave a glucuronide.

[0176] From the foregoing, it will be appreciated that, although specific embodiments of the invention have been described herein for purposes of illustration, various modifications may be made without deviating from the spirit and scope of the invention. Accordingly, the invention is not limited except as by the appended claims.

BRIEF DESCRIPTION OF THE SEQUENCE LISTING

[0177] SEQ ID No. 1 is a nucleotide sequence which encodes a glucuronide repressor.

[0178] SEQ ID No. 2 is a predicted amino acid sequence of E. coli gusR.

[0179] SEQ ID No. 3 is a nucleotide sequence of the intergenic region between gusR and gusA that contains promoter/operator sequence.

[0180] SEQ ID No. 4 is a nucleotide sequence of the gus operon.

[0181] SEQ ID No. 5 is the predicted amino acid sequence of E. coli gusA.

[0182] SEQ ID No. 6 is a predicted amino acid sequence of E. coli gusB.

[0183] SEQ ID No. 7 is a predicted amino acid sequence of E. coli gusC.

1 19 585 base pairs nucleic acid single linear 1 ATGGATAACA TGCAGACTGAAGCACAACCG ACACGGACCC GGATCCTCAA TGCTGCCAGA 60 GAGATTTTTT CAGAAAATGGATTTCACAGT GCCTCGATGA AAGCCATCTG TAAATCTTG 120 GCCATTAGTC CCGGGACGCTCTATCACCAT TTCATCTCCA AAGAAGCCTT GATTCAGGC 180 ATTATCTTAC AGGACCAGGAGAGGGCGCTG GCCCGTTTCC GGGAACCGAT TGAAGGGAT 240 CATTTCGTTG ACTATATGGTCGAGTCCATT GTCTCTCTCA CCCATGAAGC CTTTGGACA 300 CGGGCGCTGG TGGTTGAAATTATGGCGGAA GGGATGCGTA ACCCACAGGT CGCCGCCAT 360 CTTAAAAATA AGCATATGACGATCACGGAA TTTGTTGCCC AGCGGATGCG TGATGCCCA 420 CAAAAAGGCG AGATAAGCCCAGACATCAAC ACGGCAATGA CTTCACGTTT ACTGCTGGA 480 CTGACCTACG GTGTACTGGCCGATATCGAA GCGGAAGACC TGGCGCGTGA AGCGTCGTT 540 GCTCAGGGAT TACGCGCGATGATTGGCGGT ATCTTAACCG CATCC 585 195 amino acids amino acid <Unknown>linear 2 Met Asp Asn Met Gln Thr Glu Ala Gln Pro Thr Arg Thr Arg Ile Le1 5 10 15 Asn Ala Ala Arg Glu Ile Phe Ser Glu Asn Gly Phe His Ser Ala Se20 25 30 Met Lys Ala Ile Cys Lys Ser Cys Ala Ile Ser Pro Gly Thr Leu Ty35 40 45 His His Phe Ile Ser Lys Glu Ala Leu Ile Gln Ala Ile Ile Leu Gl50 55 60 Asp Gln Glu Arg Ala Leu Ala Arg Phe Arg Glu Pro Ile Glu Gly Il65 70 75 80 His Phe Val Asp Tyr Met Val Glu Ser Ile Val Ser Leu Thr HisGl 85 90 95 Ala Phe Gly Gln Arg Ala Leu Val Val Glu Ile Met Ala Glu GlyMe 100 105 110 Arg Asn Pro Gln Val Ala Ala Met Leu Lys Asn Lys His MetThr Il 115 120 125 Thr Glu Phe Val Ala Gln Arg Met Arg Asp Ala Gln GlnLys Gly Gl 130 135 140 Ile Ser Pro Asp Ile Asn Thr Ala Met Thr Ser ArgLeu Leu Leu As 145 150 155 160 Leu Thr Tyr Gly Val Leu Ala Asp Ile GluAla Glu Asp Leu Ala Ar 165 170 175 Glu Ala Ser Phe Ala Gln Gly Leu ArgAla Met Ile Gly Gly Ile Le 180 185 190 Thr Ala Ser 195 390 base pairsnucleic acid single linear 3 TTCTCTCTCT TTTTCGGCGG GCTGGTGATA ACTGTGCCCGCGTTTCATAT CGTAATTTCT 60 CTGTGCAAAA ATTATCCTTC CCGGCTTCGG AGAATTCCCCCCAAAATATT CACTGTAGC 120 ATATGTCATG AGAGTTTATC GTTCCCAATA CGCTCGAACGAACGTTCGGT TGCTTATTT 180 ATGGCTTCTG TCAACGCTGT TTTAAAGATT AATGCGATCTATATCACGCT GTGGGTATT 240 CAGTTTTTGG TTTTTTGATC 
GCGGTGTCAG TTCTTTTTATTTCCATTTCT CTTCCATGG 300 TTTCTCACAG ATAACTGTGT GCAACACAGA ATTGGTTAACTAATCAGATT AAAGGTTGA 360 CAGTATTATT ATCTTAATGA GGAGTCCCTT 390 7742 basepairs nucleic acid single linear 4 CTGGTCAGAA ATATGGCGTT TGACCTGGGTGAAAAAAATA TTCGGGTAAA TGGCATTGCG 60 CCGGGGGCAA TATTAACCGA TGCCCTGAAATCCGTTATTA CACCAGAAAT TGAACAAAA 120 ATGTTACAGC ACACGCCGAT CAGACGTCTGGGCCAACCGC AAGATATTGC TAACGCAGC 180 CTGTTCCTTT GCTCGCCTGC TGCGAGCTGGGTAAGCGGAC AAATTCTCAC CGTCTCCGG 240 GGTGGGGTAC AGGAGCTCAA TTAATACACTAACGGACCGG TAAACAACCG TGCGTGTTG 300 TTACCGGGAT AAACTCATCA ACGTCTCTGCTAAATAACTG GCAGCCAAAT CACGGCTAT 360 GGTTAACCAA TTTCAGAGTG AAAAGTATACGAATAGAGTG TGCCTTCGCA CTATTCAAC 420 GCAATGATAG GCGCTCACCT GACAACGCGGTAAACTAGTT ATTCACGCTA ACTATAATG 480 TTTAATGATG GATAACATGC AGACTGAAGCACAACCGACA CGGACCCGGA TCCTCAATG 540 TGCCAGAGAG ATTTTTTCAG AAAATGGATTTCACAGTGCC TCGATGAAAG CCATCTGTA 600 ATCTTGCGCC ATTAGTCCCG GGACGCTCTATCACCATTTC ATCTCCAAAG AAGCCTTGA 660 TCAGGCGATT ATCTTACAGG ACCAGGAGAGGGCGCTGGCC CGTTTCCGGG AACCGATTG 720 AGGGATTCAT TTCGTTGACT ATATGGTCGAGTCCATTGTC TCTCTCACCC ATGAAGCCT 780 TGGACAACGG GCGCTGGTGG TTGAAATTATGGCGGAAGGG ATGCGTAACC CACAGGTCG 840 CGCCATGCTT AAAAATAAGC ATATGACGATCACGGAATTT GTTGCCCAGC GGATGCGTG 900 TGCCCAGCAA AAAGGCGAGA TAAGCCCAGACATCAACACG GCAATGACTT CACGTTTAC 960 GCTGGATCTG ACCTACGGTG TACTGGCCGATATCGAAGCG GAAGACCTGG CGCGTGAA 1020 GTCGTTTGCT CAGGGATTAC GCGCGATGATTGGCGGTATC TTAACCGCAT CCTGATTC 1080 TCTCTTTTTC GGCGGGCTGG TGATAACTGTGCCCGCGTTT CATATCGTAA TTTCTCTG 1140 CAAAAATTAT CCTTCCCGGC TTCGGAGAATTCCCCCCAAA ATATTCACTG TAGCCATA 1200 TCATGAGAGT TTATCGTTCC CAATACGCTCGAACGAACGT TCGGTTGCTT ATTTTATG 1260 TTCTGTCAAC GCTGTTTTAA AGATTAATGCGATCTATATC ACGCTGTGGG TATTGCAG 1320 TTTGGTTTTT TGATCGCGGT GTCAGTTCTTTTTATTTCCA TTTCTCTTCC ATGGGTTT 1380 CACAGATAAC TGTGTGCAAC ACAGAATTGGTTAACTAATC AGATTAAAGG TTGACCAG 1440 TTATTATCTT AATGAGGAGT CCCTTATGTTACGTCCTGTA GAAACCCCAA CCCGTGAA 1500 CAAAAAACTC GACGGCCTGT GGGCATTCAGTCTGGATCGC GAAAACTGTG GAATTGAT 1560 GCGTTGGTGG GAAAGCGCGT 
TACAAGAAAGCCGGGCAATT GCTGTGCCAG GCAGTTTT 1620 CGATCAGTTC GCCGATGCAG ATATTCGTAATTATGCGGGC AACGTCTGGT ATCAGCGC 1680 AGTCTTTATA CCGAAAGGTT GGGCAGGCCAGCGTATCGTG CTGCGTTTCG ATGCGGTC 1740 TCATTACGGC AAAGTGTGGG TCAATAATCAGGAAGTGATG GAGCATCAGG GCGGCTAT 1800 GCCATTTGAA GCCGATGTCA CGCCGTATGTTATTGCCGGG AAAAGTGTAC GTATCACC 1860 TTGTGTGAAC AACGAACTGA ACTGGCAGACTATCCCGCCG GGAATGGTGA TTACCGAC 1920 AAACGGCAAG AAAAAGCAGT CTTACTTCCATGATTTCTTT AACTATGCCG GGATCCAT 1980 CAGCGTAATG CTCTACACCA CGCCGAACACCTGGGTGGAC GATATCACCG TGGTGACG 2040 TGTCGCGCAA GACTGTAACC ACGCGTCTGTTGACTGGCAG GTGGTGGCCA ATGGTGAT 2100 CAGCGTTGAA CTGCGTGATG CGGATCAACAGGTGGTTGCA ACTGGACAAG GCACTAGC 2160 GACTTTGCAA GTGGTGAATC CGCACCTCTGGCAACCGGGT GAAGGTTATC TCTATGAA 2220 GTGCGTCACA GCCAAAAGCC AGACAGAGTGTGATATCTAC CCGCTTCGCG TCGGCATC 2280 GTCAGTGGCA GTGAAGGGCG AACAGTTCCTGATTAACCAC AAACCGTTCT ACTTTACT 2340 CTTTGGTCGT CATGAAGATG CGGACTTACGTGGCAAAGGA TTCGATAACG TGCTGATG 2400 GCACGACCAC GCATTAATGG ACTGGATTGGGGCCAACTCC TACCGTACCT CGCATTAC 2460 TTACGCTGAA GAGATGCTCG ACTGGGCAGATGAACATGGC ATCGTGGTGA TTGATGAA 2520 TGCTGCTGTC GGCTTTAACC TCTCTTTAGGCATTGGTTTC GAAGCGGGCA ACAAGCCG 2580 AGAACTGTAC AGCGAAGAGG CAGTCAACGGGGAAACTCAG CAAGCGCACT TACAGGCG 2640 TAAAGAGCTG ATAGCGCGTG ACAAAAACCACCCAAGCGTG GTGATGTGGA GTATTGCC 2700 CGAACCGGAT ACCCGTCCGC AAGTGCACGGGAATATTTCG CCACTGGCGG AAGCAACG 2760 TAAACTCGAC CCGACGCGTC CGATCACCTGCGTCAATGTA ATGTTCTGCG ACGCTCAC 2820 CGATACCATC AGCGATCTCT TTGATGTGCTGTGCCTGAAC CGTTATTACG GATGGTAT 2880 CCAAAGCGGC GATTTGGAAA CGGCAGAGAAGGTACTGGAA AAAGAACTTC TGGCCTGG 2940 GGAGAAACTG CATCAGCCGA TTATCATCACCGAATACGGC GTGGATACGT TAGCCGGG 3000 GCACTCAATG TACACCGACA TGTGGAGTGAAGAGTATCAG TGTGCATGGC TGGATATG 3060 TCACCGCGTC TTTGATCGCG TCAGCGCCGTCGTCGGTGAA CAGGTATGGA ATTTCGCC 3120 TTTTGCGACC TCGCAAGGCA TATTGCGCGTTGGCGGTAAC AAGAAAGGGA TCTTCACT 3180 CGACCGCAAA CCGAAGTCGG CGGCTTTTCTGCTGCAAAAA CGCTGGACTG GCATGAAC 3240 CGGTGAAAAA CCGCAGCAGG GAGGCAAACAATGAATCAAC AACTCTCCTG GCGCACCA 3300 GTCGGCTACA GCCTCGGTGA CGTCGCCAATAACTTCGCCT 
TCGCAATGGG GGCGCTCT 3360 CTGTTGAGTT ACTACACCGA CGTCGCTGGCGTCGGTGCCG CTGCGGCGGG CACCATGC 3420 TTACTGGTGC GGGTATTCGA TGCCTTCGCCGACGTCTTTG CCGGACGAGT GGTGGACA 3480 GTGAATACCC GCTGGGGAAA ATTCCGCCCGTTTTTACTCT TCGGTACTGC GCCGTTAA 3540 ATCTTCAGCG TGCTGGTATT CTGGGTGCCGACCGACTGGA GCCATGGTAG CAAAGTGG 3600 TATGCATATT TGACCTACAT GGGCCTCGGGCTTTGCTACA GCCTGGTGAA TATTCCTT 3660 GGTTCACTTG CTACCGCGAT GACCCAACAACCACAATCCC GCGCCCGTCT GGGCGCGG 3720 CGTGGGATTG CCGCTTCATT GACCTTTGTCTGCCTGGCAT TTCTGATAGG ACCGAGCA 3780 AAGAACTCCA GCCCGGAAGA GATGGTGTCGGTATACCATT TCTGGACAAT TGTGCTGG 3840 ATTGCCGGAA TGGTGCTTTA CTTCATCTGCTTCAAATCGA CGCGTGAGAA TGTGGTAC 3900 ATCGTTGCGC AGCCGTCATT GAATATCAGTCTGCAAACCC TGAAACGGAA TCGCCCGC 3960 TTTATGTTGT GCATCGGTGC GCTGTGTGTGCTGATTTCGA CCTTTGCGGT CAGCGCCT 4020 TCGTTGTTCT ACGTGCGCTA TGTGTTAAATGATACCGGGC TGTTCACTGT GCTGGTAC 4080 GTGCAAAACC TGGTTGGTAC TGTGGCATCGGCACCGCTGG TGCCGGGGAT GGTCGCGA 4140 ATCGGTAAAA AGAATACCTT CCTGATTGGCGCTTTGCTGG GAACCTGCGG TTATCTGC 4200 TTCTTCTGGG TTTCCGTCTG GTCACTGCCGGTGGCGTTGG TTGCGTTGGC CATCGCTT 4260 ATTGGTCAGG GCGTTACCAT GACCGTGATGTGGGCGCTGG AAGCTGATAC CGTAGAAT 4320 GGTGAATACC TGACCGGCGT GCGAATTGAAGGGCTCACCT ATTCACTATT CTCATTTA 4380 CGTAAATGCG GTCAGGCAAT CGGAGGTTCAATTCCTGCCT TTATTTTGGG GTTAAGCG 4440 TATATCGCCA ATCAGGTGCA AACGCCGGAAGTTATTATGG GCATCCGCAC ATCAATTG 4500 TTAGTACCTT GCGGATTTAT GCTACTGGCATTCGTTATTA TCTGGTTTTA TCCGCTCA 4560 GATAAAAAAT TCAAAGAAAT CGTGGTTGAAATTGATAATC GTAAAAAAGT GCAGCAGC 4620 TTAATCAGCG ATATCACTAA TTAATATTCAATAAAAATAA TCAGAACATC AAAGGTGC 4680 CTATGAGAAA AATAGTGGCC ATGGCCGTTATTTGCCTGAC GGCTGCCTCT GGCCTTAC 4740 CTGCTTATGC GGCGCAACTG GCTGACGATGAAGCGGGACT ACGCATCAGA CTGAAAAA 4800 AATTGCGCAG GGCGGATAAG CCCAGTGCTGGCGCGGGAAG AGATATTTAC GCATGGGT 4860 AGGGAGGATT GCTCGATTTC AATAGTGGTTATTATTCCAA TATTATTGGC GTTGAAGG 4920 GGGCGTATTA TGTTTATAAA TTAGGTGCTCGTGCTGATAT GAGTACCCGG TGGTATCT 4980 ATGGTGATAA AAGTTTTGCT TTGCCCGGGGCAGTAAAAAT AAAACCCAGT GAAAATAG 5040 TGCTTAAATT AGGTCGCTTC GGGACGGATTATAGTTATGG TAGCTTACCT TATCGTAT 5100 
CGTTAATGGC TGGCAGTTCG CAACGTACATTACCGACAGT TTCTGAAGGA GCATTAGG 5160 ATTGGGCTTT AACACCAAAT ATTGATCTGTGGGGAATGTG GCGTTCACGA GTATTTTT 5220 GGACTGATTC AACAACCGGT ATTCGTGATGAAGGGGTGTA TAACAGCCAG ACGGGAAA 5280 ACGATAAACA TCGCGCACGT TCTTTTTTAGCCGCCAGTTG GCATGATGAT ACCAGTCG 5340 ATTCTCTGGG GGCATCGGTA CAGAAAGATGTTTCCAATCA GATACAAAGT ATTCTCGA 5400 AAAGCATACC GCTCGACCCG AATTATACGTTGAAAGGGGA GTTGCTCGGC TTTTACGC 5460 AGCTCGAAGG TTTAAGTCGT AATACCAGCCAGCCCAATGA AACGGCGTTG GTTAGTGG 5520 AATTGACCTG GAATGCGCCG TGGGGAAGTGTATTTGGCAG TGGTGGTTAT TTGCGCCA 5580 CAATGAATGG TGCCGTGGTG GATACCGACATTGGCTATCC CTTTTCATTA AGTCTTGA 5640 GTAACCGTGA AGGAATGCAG TCCTGGCAATTGGGCGTCAA CTATCGTTTA ACGCCGCA 5700 TTACGCTGAC ATTTGCACCG ATTGTGACTCGCGGCTATGA ATCCAGTAAA CGAGATGT 5760 GGATTGAAGG CACGGGTATC TTAGGTGGTATGAACTATCG GGTCAGCGAA GGGCCGTT 5820 AAGGGATGAA TTTCTTTCTT GCTGCCGATAAAGGGCGGGA AAAGCGCGAT GGCAGTAC 5880 TGGGCGATCG CCTGAATTAC TGGGATGTGAAAATGAGTAT TCAGTATGAC TTTATGCT 5940 AGTAAAAAAT AACGCCGGAG AGAAAAATCTCCGGCGTTTC AGATTGTTGA CAAAGTGC 6000 TTTTTTATGC CGGATGCGGC TAAACGCCTTATCCAGCCTA CAAAAACTCA TAAATTCA 6060 GTGTTGCAGG AAAAGGTAGG CCTGATAAGCGTAGCGCATC AGGCAATCTC TGGTTTGT 6120 TCAGATGAAA ACGCCGGAGT GAAAATTCTCCGGCGTTTTG GCCGTGAATT ACTGCTGC 6180 AATTGCCGGT ACAGCCGGAA CGTTAAGAGCTGGCATCGCA AACATGCCAA CAAAATCT 6240 TAACGACATT TTCTGCCCAT TTAACGTTATCTGACCGTTA GCATATTGCA GGCTGGTG 6300 GATGGTATTG TCCTGCAAGG TGGTCAGACGGAACATCTGC CCCATTGCTG ATGCACCT 6360 AACTTGCTGT TTCGCCAGTT TTTTCGCTTGATCTTCCTGA TAACCTTCCT CGCTACCT 6420 GTCATAAACT CAGTTGCCAT ATCCACCGGAATGGTCAGTT TCGCATCCAG AGATTTAA 6480 GAACGATCTA CTTCCTGCGC CAGCGTTTGCGGCGCTTCTT TAGTCGTTGC CGGATCTT 6540 AGGAACAGCG ACAGATTCAG GGCACTTTCACCCTGACTGT TTTTCCAGCT TAGCGGCG 6600 ATAGTAATCA CCGGATCGCC TTTCAGCATCAGCGGCAGGG CGCTAAAGAA GGCTTCCG 6660 ACTTTCTCCT GATAAAGTTC GGGGTTGTTGGCAATTTCTG GCTGTCGCGA CAGCGCCT 6720 GTTTGCGCGT TATATTGCTG GCTAAACTGATGCCAGGCTT CACCATCAAT CTGGCCGA 6780 TTTAAAGTCA GCTTGCCGCT GCCCAGATCCTGATTCTGTA CCTTCAGGCT GTTTAGCG 6840 TAATCCAGTT GGCTATTGAT 
CGTTTTACCGTCATTGACCA GATCCGATTT ACCGCTGA 6900 TCCATGCCTT CCAGCAGTGC CAGTTCTTTGCCTTCCACTG AAATGGTCAT TTTTTCCA 6960 GACAGTTTTT GATTTCCTAC ACGCTCACCAAAACTTGCCA GCGTGCTGGA ACCGTCGG 7020 TTCAGATTAT TAAAGGTCAA CTGCACTTTCTGGTTGTATT CGTTAACTGC GTCTATCC 7080 ACCACTTTGC GCCTCCCCGG AAAGGGAGATGGCTTTGCGT CTCTGTCAGC ATTTAACT 7140 AACTCGCCGC CGCTAAAGGC GACTTTTTCATCCTTTTGCT CGTAATTCAG TGGCTTGA 7200 GAAATATCGG AACTGGAATC ACCGCTGTAACCAATGCGCG AGTTAATCTC AAAAGGCG 7260 TCACCTTTTG CCATATCAAA CAGTGGTTTGCTTACTTCGT TATTAACCAG CGTGGTTT 7320 ATTGATGCCA TCGACGGGAT CAGGTTCAGTTTTTTAAGCT GGGCAAGCGG GAAGGGAC 7380 TGATCAACCG ATTCGTTGAA GATGACGCTCTGACCGCTTT TAATCCACGG ATTTTCTT 7440 CCGGCAATGG GTTTCACCAA CAGTTGCAACTGGCTGCTGA ATACGCCGCG ATGATAGT 7500 TGATAACTCA CTTCCAGGTT GGATTCAGGAGCTGTCAGTT TGAGTTGCGC GTTCGCCT 7560 GCGACCATGT CTTCGAGATG GGTTTCAATCTTCTTGCCTG TATACCATGC GCCGCCTG 7620 CAGACTACGC CTAGCGCAAC AATGACGCCTACCGCTACCA GCGATTTATT CATAATGA 7680 ATCCATAAAA TGAAATCAGG CGGACTGGCCGCCTGAAGGT GTTATAAGCC TTTAATAA 7740 TT 7742 602 amino acids amino acidsingle linear 5 Met Leu Arg Pro Val Glu Thr Pro Thr Arg Glu Ile Lys LysLeu As 1 5 10 15 Gly Leu Trp Ala Phe Ser Leu Asp Arg Glu Asn Cys Gly IleAsp Gl 20 25 30 Arg Trp Trp Glu Ser Ala Leu Gln Glu Ser Arg Ala Ile AlaVal Pr 35 40 45 Gly Ser Phe Asn Asp Gln Phe Ala Asp Ala Asp Ile Arg AsnTyr Al 50 55 60 Gly Asn Val Trp Tyr Gln Arg Glu Val Phe Ile Pro Lys GlyTrp Al 65 70 75 80 Gly Gln Arg Ile Val Leu Arg Phe Asp Ala Val Thr HisTyr Gly Ly 85 90 95 Val Trp Val Asn Asn Gln Glu Val Met Glu His Gln GlyGly Tyr Th 100 105 110 Pro Phe Glu Ala Asp Val Thr Pro Tyr Val Ile AlaGly Lys Ser Va 115 120 125 Arg Ile Thr Val Cys Val Asn Asn Glu Leu AsnTrp Gln Thr Ile Pr 130 135 140 Pro Gly Met Val Ile Thr Asp Glu Asn GlyLys Lys Lys Gln Ser Ty 145 150 155 160 Phe His Asp Phe Phe Asn Tyr AlaGly Ile His Arg Ser Val Met Le 165 170 175 Tyr Thr Thr Pro Asn Thr TrpVal Asp Asp Ile Thr Val Val Thr Hi 180 185 190 Val Ala Gln Asp Cys AsnHis Ala Ser Val Asp Trp Gln Val Val Al 195 200 205 
Asn Gly Asp Val SerVal Glu Leu Arg Asp Ala Asp Gln Gln Val Va 210 215 220 Ala Thr Gly GlnGly Thr Ser Gly Thr Leu Gln Val Val Asn Pro Hi 225 230 235 240 Leu TrpGln Pro Gly Glu Gly Tyr Leu Tyr Glu Leu Cys Val Thr Al 245 250 255 LysSer Gln Thr Glu Cys Asp Ile Tyr Pro Leu Arg Val Gly Ile Ar 260 265 270Ser Val Ala Val Lys Gly Glu Gln Phe Leu Ile Asn His Lys Pro Ph 275 280285 Tyr Phe Thr Gly Phe Gly Arg His Glu Asp Ala Asp Leu Arg Gly Ly 290295 300 Gly Phe Asp Asn Val Leu Met Val His Asp His Ala Leu Met Asp Tr305 310 315 320 Ile Gly Ala Asn Ser Tyr Arg Thr Ser His Tyr Pro Tyr AlaGlu Gl 325 330 335 Met Leu Asp Trp Ala Asp Glu His Gly Ile Val Val IleAsp Glu Th 340 345 350 Ala Ala Val Gly Phe Asn Leu Ser Leu Gly Ile GlyPhe Glu Ala Gl 355 360 365 Asn Lys Pro Lys Glu Leu Tyr Ser Glu Glu AlaVal Asn Gly Glu Th 370 375 380 Gln Gln Ala His Leu Gln Ala Ile Lys GluLeu Ile Ala Arg Asp Ly 385 390 395 400 Asn His Pro Ser Val Val Met TrpSer Ile Ala Asn Glu Pro Asp Th 405 410 415 Arg Pro Gln Val His Gly AsnIle Ser Pro Leu Ala Glu Ala Thr Ar 420 425 430 Lys Leu Asp Pro Thr ArgPro Ile Thr Cys Val Asn Val Met Phe Cy 435 440 445 Asp Ala His Thr AspThr Ile Ser Asp Leu Phe Asp Val Leu Cys Le 450 455 460 Asn Arg Tyr TyrGly Trp Tyr Val Gln Ser Gly Asp Leu Glu Thr Al 465 470 475 480 Glu LysVal Leu Glu Lys Glu Leu Leu Ala Trp Gln Glu Lys Leu Hi 485 490 495 GlnPro Ile Ile Ile Thr Glu Tyr Gly Val Asp Thr Leu Ala Gly Le 500 505 510His Ser Met Tyr Thr Asp Met Trp Ser Glu Glu Tyr Gln Cys Ala Tr 515 520525 Leu Asp Met Tyr His Arg Val Phe Asp Arg Val Ser Ala Val Val Gl 530535 540 Glu Gln Val Trp Asn Phe Ala Asp Phe Ala Thr Ser Gln Gly Ile Le545 550 555 560 Arg Val Gly Gly Asn Lys Lys Gly Ile Phe Thr Arg Asp ArgLys Pr 565 570 575 Lys Ser Ala Ala Phe Leu Leu Gln Lys Arg Trp Thr GlyMet Asn Ph 580 585 590 Gly Glu Lys Pro Gln Gln Gly Gly Lys Gln 595 600457 amino acids amino acid single linear 6 Met Asn Gln Gln Leu Ser TrpArg Thr Ile Val Gly Tyr Ser Leu Gl 1 5 10 15 Asp Val Ala Asn Asn Phe AlaPhe Ala Met Gly 
Ala Leu Phe Leu Le 20 25 30 Ser Tyr Tyr Thr Asp Val AlaGly Val Gly Ala Ala Ala Ala Gly Th 35 40 45 Met Leu Leu Leu Val Arg ValPhe Asp Ala Phe Ala Asp Val Phe Al 50 55 60 Gly Arg Val Val Asp Ser ValAsn Thr Arg Trp Gly Lys Phe Arg Pr 65 70 75 80 Phe Leu Leu Phe Gly ThrAla Pro Leu Met Ile Phe Ser Val Leu Va 85 90 95 Phe Trp Val Pro Thr AspTrp Ser His Gly Ser Lys Val Val Tyr Al 100 105 110 Tyr Leu Thr Tyr MetGly Leu Gly Leu Cys Tyr Ser Leu Val Asn Il 115 120 125 Pro Tyr Gly SerLeu Ala Thr Ala Met Thr Gln Gln Pro Gln Ser Ar 130 135 140 Ala Arg LeuGly Ala Ala Arg Gly Ile Ala Ala Ser Leu Thr Phe Va 145 150 155 160 CysLeu Ala Phe Leu Ile Gly Pro Ser Ile Lys Asn Ser Ser Pro Gl 165 170 175Glu Met Val Ser Val Tyr His Phe Trp Thr Ile Val Leu Ala Ile Al 180 185190 Gly Met Val Leu Tyr Phe Ile Cys Phe Lys Ser Thr Arg Glu Asn Va 195200 205 Val Arg Ile Val Ala Gln Pro Ser Leu Asn Ile Ser Leu Gln Thr Le210 215 220 Lys Arg Asn Arg Pro Leu Phe Met Leu Cys Ile Gly Ala Leu CysVa 225 230 235 240 Leu Ile Ser Thr Phe Ala Val Ser Ala Ser Ser Leu PheTyr Val Ar 245 250 255 Tyr Val Leu Asn Asp Thr Gly Leu Phe Thr Val LeuVal Leu Val Gl 260 265 270 Asn Leu Val Gly Thr Val Ala Ser Ala Pro LeuVal Pro Gly Met Va 275 280 285 Ala Arg Ile Gly Lys Lys Asn Thr Phe LeuIle Gly Ala Leu Leu Gl 290 295 300 Thr Cys Gly Tyr Leu Leu Phe Phe TrpVal Ser Val Trp Ser Leu Pr 305 310 315 320 Val Ala Leu Val Ala Leu AlaIle Ala Ser Ile Gly Gln Gly Val Th 325 330 335 Met Thr Val Met Trp AlaLeu Glu Ala Asp Thr Val Glu Tyr Gly Gl 340 345 350 Tyr Leu Thr Gly ValArg Ile Glu Gly Leu Thr Tyr Ser Leu Phe Se 355 360 365 Phe Thr Arg LysCys Gly Gln Ala Ile Gly Gly Ser Ile Pro Ala Ph 370 375 380 Ile Leu GlyLeu Ser Gly Tyr Ile Ala Asn Gln Val Gln Thr Pro Gl 385 390 395 400 ValIle Met Gly Ile Arg Thr Ser Ile Ala Leu Val Pro Cys Gly Ph 405 410 415Met Leu Leu Ala Phe Val Ile Ile Trp Phe Tyr Pro Leu Thr Asp Ly 420 425430 Lys Phe Lys Glu Ile Val Val Glu Ile Asp Asn Arg Lys Lys Val Gl 435440 445 Gln Gln Leu Ile Ser Asp Ile Thr Asn 450 
455 416 amino acidsamino acid single linear 7 Met Ala Met Ala Val Ile Cys Leu Thr Ala AlaSer Gly Leu Thr Se 1 5 10 15 Ala Tyr Ala Ala Gln Leu Ala Asp Asp Glu AlaGly Leu Arg Ile Ar 20 25 30 Leu Lys Asn Glu Leu Arg Arg Ala Asp Lys ProSer Ala Gly Ala Gl 35 40 45 Arg Asp Ile Tyr Ala Trp Val Gln Gly Gly LeuLeu Asp Phe Asn Se 50 55 60 Gly Tyr Tyr Ser Asn Ile Ile Gly Val Glu GlyGly Ala Tyr Tyr Va 65 70 75 80 Tyr Lys Leu Gly Ala Arg Ala Asp Met SerThr Arg Trp Tyr Leu As 85 90 95 Gly Asp Lys Ser Phe Ala Leu Pro Gly AlaVal Lys Ile Lys Pro Se 100 105 110 Glu Asn Ser Leu Leu Lys Leu Gly ArgPhe Gly Thr Asp Tyr Ser Ty 115 120 125 Gly Ser Leu Pro Tyr Arg Ile ProLeu Met Ala Gly Ser Ser Gln Ar 130 135 140 Thr Leu Pro Thr Val Ser GluGly Ala Leu Gly Tyr Trp Ala Leu Th 145 150 155 160 Pro Asn Ile Asp LeuTrp Gly Met Trp Arg Ser Arg Val Phe Leu Tr 165 170 175 Thr Asp Ser ThrThr Gly Ile Arg Asp Glu Gly Val Tyr Asn Ser Gl 180 185 190 Thr Gly LysTyr Asp Lys His Arg Ala Arg Ser Phe Leu Ala Ala Se 195 200 205 Trp HisAsp Asp Thr Ser Arg Tyr Ser Leu Gly Ala Ser Val Gln Ly 210 215 220 AspVal Ser Asn Gln Ile Gln Ser Ile Leu Glu Lys Ser Ile Pro Le 225 230 235240 Asp Pro Asn Tyr Thr Leu Lys Gly Glu Leu Leu Gly Phe Tyr Ala Gl 245250 255 Leu Glu Gly Leu Ser Arg Asn Thr Ser Gln Pro Asn Glu Thr Ala Le260 265 270 Val Ser Gly Gln Leu Thr Trp Asn Ala Pro Trp Gly Ser Val PheGl 275 280 285 Ser Gly Gly Tyr Leu Arg His Ala Met Asn Gly Ala Val ValAsp Th 290 295 300 Asp Ile Gly Tyr Pro Phe Ser Leu Ser Leu Asp Arg AsnArg Glu Gl 305 310 315 320 Met Gln Ser Trp Gln Leu Gly Val Asn Tyr ArgLeu Thr Pro Gln Ph 325 330 335 Thr Leu Thr Phe Ala Pro Ile Val Thr ArgGly Tyr Glu Ser Ser Ly 340 345 350 Arg Asp Val Arg Ile Glu Gly Thr GlyIle Leu Gly Gly Met Asn Ty 355 360 365 Arg Val Ser Glu Gly Pro Leu GlnGly Met Asn Phe Phe Leu Ala Al 370 375 380 Asp Lys Gly Arg Glu Lys ArgAsp Gly Ser Thr Leu Gly Asp Arg Le 385 390 395 400 Asn Tyr Trp Asp ValLys Met Ser Ile Gln Tyr Asp Phe Met Leu Ly 405 410 415 46 base pairsnucleic acid 
single linear 8 CGAGAATTCG AGGAGTCCAT CATGATGGAT AACATGCAGA CTGAAG 46
38 base pairs nucleic acid single linear 9 GCTGAATTCA AGCTTCAGGA TGCGGTTAAG ATACCGCC 38
37 base pairs nucleic acid single linear 10 GACCAGGTTA CCATGGATAA CATGCAGACT GAAGCAC 37
43 base pairs nucleic acid single linear 11 GACGTGATGG TGGCTAGCGG ATGCGGTTAA GATACCGCCA ATC 43
30 base pairs nucleic acid single linear 12 GATCCACAGA ATTGGTTAAC TAATCAGATG 30
30 base pairs nucleic acid single linear 13 GTGTCTTAAC CAATTGATTA GTCTACTAAT 30
28 base pairs nucleic acid single linear 14 GATCCGGCTA TTGGTTAACC AATTTCAG 28
28 base pairs nucleic acid single linear 15 GCCGATAACC AATTGGTTAA AGTCTAAT 28
48 base pairs nucleic acid single linear 16 AATTCCGTTC CCAATACGCT CGAACGAACG TTCGGTTGCT TATTTTAG 48
48 base pairs nucleic acid single linear 17 GGCAAGGGTT ATGCGAGCTT GCTTGCAAGC CAACGAATAA AATCCTAG 48
22 base pairs nucleic acid single linear 18 GATCCCATCG AACGTTCGAT GG 22
22 base pairs nucleic acid single linear 19 GGTAGCTTGC AAGCTACCCT AG 22
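The listing does not say how the oligonucleotides of SEQ ID NOs: 12-19 relate to one another. The following Python sketch checks one possible reading: that each consecutive pair (12/13, 14/15, 16/17, 18/19) forms a duplex with a 4-nucleotide single-stranded overhang at each end, the second oligonucleotide being written as the base-by-base complement of the first. The 4-nucleotide overhang length and the helper names `complement` and `anneals_with_overhang` are assumptions made for illustration only; the sequences themselves are copied from the listing.

```python
# Oligonucleotide pairs copied from the sequence listing (SEQ ID NOs: 12-19).
PAIRS = {
    (12, 13): ("GATCCACAGAATTGGTTAACTAATCAGATG",
               "GTGTCTTAACCAATTGATTAGTCTACTAAT"),
    (14, 15): ("GATCCGGCTATTGGTTAACCAATTTCAG",
               "GCCGATAACCAATTGGTTAAAGTCTAAT"),
    (16, 17): ("AATTCCGTTCCCAATACGCTCGAACGAACGTTCGGTTGCTTATTTTAG",
               "GGCAAGGGTTATGCGAGCTTGCTTGCAAGCCAACGAATAAAATCCTAG"),
    (18, 19): ("GATCCCATCGAACGTTCGATGG",
               "GGTAGCTTGCAAGCTACCCTAG"),
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")
OVERHANG = 4  # assumed length of the single-stranded overhang at each end


def complement(seq: str) -> str:
    """Base-by-base complement, keeping the left-to-right order."""
    return seq.translate(COMPLEMENT)


def anneals_with_overhang(top: str, bottom: str, overhang: int = OVERHANG) -> bool:
    """True if `bottom`, minus its trailing overhang, is the complement
    of `top` minus its leading overhang (hypothetical pairing model)."""
    duplex_top = top[overhang:]
    duplex_bottom = bottom[:len(bottom) - overhang]
    return complement(duplex_top) == duplex_bottom


results = {ids: anneals_with_overhang(*seqs) for ids, seqs in PAIRS.items()}
for ids, ok in results.items():
    print(f"SEQ ID NOs {ids[0]}/{ids[1]}: duplex region complementary = {ok}")
```

Under this assumed pairing model all four pairs are complementary over the duplex region. This is a consistency check on the listed sequences, not a statement of how the oligonucleotides were used.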

We claim:
 1. An isolated nucleic acid molecule encoding a glucuronide repressor.
 2. The nucleic acid molecule of claim 1 wherein the nucleic acid molecule comprises the sequence presented in SEQ ID NO: 1, or hybridizes under stringent conditions to the complement of the sequence presented in SEQ ID NO: 1, and which encodes a functional glucuronide repressor.
 3. The nucleic acid molecule of claim 1 wherein the nucleic acid molecule encodes SEQ ID NO: 2, or variant thereof and which encodes a functional glucuronide repressor.
 4. An isolated nucleic acid molecule encoding a domain of a glucuronide repressor that binds a glucuronidase operator.
 5. The nucleic acid molecule of claim 4 wherein the nucleic acid molecule comprises the sequence of nucleotides 1 to 192 of SEQ ID NO: 1 or hybridizes under stringent conditions to the complement of nucleotides 1 to 192 of SEQ ID NO: 1, and which encodes a domain that binds a glucuronidase operator.
 6. The nucleic acid molecule of claim 4 wherein the nucleic acid molecule encodes amino acids 1-63 of SEQ ID NO: 2 or variant thereof and which encodes a domain that binds a glucuronidase operator.
 7. An isolated nucleic acid molecule encoding a domain from a glucuronide repressor that binds a glucuronide.
 8. The nucleic acid molecule of claim 7 wherein the nucleic acid molecule comprises the sequence of nucleotides 1 to 192 of SEQ ID NO: 1 or hybridizes under stringent conditions to the complement of nucleotides 1 to 192 of SEQ ID NO: 1, and which encodes a domain that binds a glucuronide.
 9. The nucleic acid molecule of claim 7 wherein the nucleic acid molecule comprises the sequence of nucleotides 192 to 462 of SEQ ID NO: 1 or variant thereof or nucleotides 195 to 585 of SEQ ID NO: 1 or variant thereof, and which encodes a domain that binds a glucuronide.
 10. The nucleic acid molecule of claim 6 wherein the nucleic acid molecule encodes amino acids 64 to 154 of SEQ ID NO: 2 or variant thereof or amino acids 64 to 195 of SEQ ID NO: 2, or variant thereof, and which encodes a domain that binds a glucuronide.
 11. An isolated glucuronide repressor that binds to a glucuronide operator and binds to a glucuronide, wherein the binding to the operator is dependent on binding to a glucuronide.
 12. An isolated glucuronide repressor, comprising the amino acid sequence of SEQ ID NO: 2 or variant thereof.
 13. An isolated protein, comprising a glucuronide binding domain from a glucuronide repressor.
 14. The protein of claim 12 wherein the glucuronide binding domain sequence comprises amino acids 64 to 154 of SEQ ID NO: 2 or variant thereof or amino acids 64 to 195 of SEQ ID NO: 2 or variant thereof.
 15. An isolated protein, comprising a domain from a glucuronide repressor that binds a glucuronidase operator.
 16. The protein of claim 15 wherein the sequence of the operator binding domain comprises amino acids 1 to 63 of SEQ ID NO: 2 or variant thereof.
 17. A fusion protein comprising a glucuronide binding domain from a glucuronide repressor and a DNA-binding domain that binds to a selected nucleotide sequence.
 18. The fusion protein of claim 17 wherein the glucuronide binding domain comprises an amino acid sequence of amino acids 64 to 154 of SEQ ID NO: 2 or variant thereof or amino acids 64 to 195 of SEQ ID NO: 2 or variant thereof.
 19. The fusion protein of claim 17, further comprising a transcriptional activator domain.
 20. The fusion protein of claim 19 wherein the N-terminal to C-terminal order of the domains is DNA binding domain-glucuronide binding domain-transcriptional activator domain.
 21. The fusion protein of claim 17, further comprising a domain that binds an aglycon of a glucuronide.
 22. A vector, comprising a nucleic acid molecule encoding a glucuronide repressor according to any one of claims 1-3.
 23. A vector, comprising a nucleic acid molecule encoding a glucuronide binding domain from a glucuronide repressor according to claim 7.
 24. A vector, comprising a nucleic acid molecule encoding a fusion protein according to claim 17.
 25. A vector according to any one of claims 22-24 wherein the vector is an expression vector.
 26. The vector of claim 25 wherein the vector is a binary Agrobacterium tumefaciens plasmid vector.
 27. A transformed host cell containing a nucleic acid molecule that encodes a glucuronide repressor.
 28. The host cell of claim 27, wherein the nucleic acid molecule comprises the sequence presented in SEQ ID NO: 1, or hybridizes under stringent conditions to the complement of the sequence presented in SEQ ID NO: 1, and which encodes a functional glucuronide repressor.
 29. A host cell transformed with a vector according to claims 22-24.
 30. The host cell of claim 29, wherein the cell is a plant cell.
 31. The host cell of claim 29, wherein the cell is an animal cell.
 32. The host cell of claim 29, wherein the cell is a fungal cell.
 33. The host cell of claim 29, wherein the cell is a bacterial cell.
 34. A method for determining the presence of a glucuronide in a sample, comprising: (a) binding a glucuronide repressor to a nucleic acid molecule comprising a glucuronide operator to form a complex; (b) contacting the complex with a sample containing a glucuronide, wherein the glucuronide binds to the repressor protein causing release of the protein from the nucleic acid molecule; and (c) detecting release of the repressor protein.
 35. A method for determining the presence of a glucuronide in a sample, comprising: (a) binding a fusion protein according to claim 17 to a nucleic acid molecule comprising a selected nucleic acid sequence to form a complex; (b) contacting the complex with a sample containing a glucuronide, wherein the glucuronide binds to the fusion protein causing release of the protein from the nucleic acid molecule; and (c) detecting release of the repressor protein.
 36. The method of either of claims 34 or 35, wherein the protein binds a single glucuronide.
 37. The method of claim 36, wherein the glucuronide is glucuronide-morphine.
 38. A method for controlling gene expression of a transgene introduced into a host cell, comprising: (a) transfecting or transforming the host cell with a nucleic acid molecule encoding a glucuronide repressor, a glucuronide operator, and a transgene, wherein the operator is operably linked to the transgene; and (b) contacting the host cell with a glucuronide that binds to the repressor protein; wherein the glucuronide causes the repressor protein to release from the operator, thereby allowing expression of the transgene.
 39. A method for controlling gene expression of a transgene introduced into a host cell, comprising: (a) transfecting or transforming the host cell with a nucleic acid molecule comprising a nucleotide sequence encoding the fusion protein according to claim 17, a glucuronide operator, and a transgene, wherein the operator is operably linked to the transgene; and (b) contacting the host cell with a glucuronide that binds to the repressor protein; wherein the glucuronide causes the repressor protein to release from the nucleotide sequence, thereby decreasing expression of the transgene.
 40. The method of either of claims 38 or 39 wherein the glucuronide is selected from the group consisting of a glucuronide derivative that is not cleaved at C1 by β-glucuronidase expressed by the cell; a glucuronide derivative that has a hydrophobic group at C6 and a glucuronide derivative that is not cleaved at C1 by β-glucuronidase expressed in the cell and has a hydrophobic group at C6.
 41. The method of either of claims 38 or 39 wherein the glucuronide is a glucuronide derivative that has a methyl ester group at C6 and a thio ether linkage at C1.
 42. The method according to either of claims 38 or 39 wherein the aglycon of the glucuronide is fluorogenic or chromogenic.
 43. The method according to either of claims 38 or 39 wherein the host cell is selected from the group consisting of a plant cell, an animal cell, a fungal cell and a bacterial cell.
 44. A method for isolating a glucuronide transport protein, comprising: (a) transfecting or transforming a cell that does not transport a glucuronide with a nucleic acid molecule encoding a glucuronide repressor and a reporter gene linked to a glucuronide operator; (b) further transfecting or transforming the cell in step (a) with an expression library from a vertebrate; (c) treating the cell in step (b) with a glucuronide; and (d) detecting expression of the reporter gene, thereby identifying a cell expressing a transport protein.
 45. A method for isolating a glucuronide repressor protein, comprising: (a) contacting the repressor protein with a matrix conjugated with a glucuronide to bind the repressor protein to the glucuronide; and (b) eluting the repressor protein, thereby isolating the repressor protein.
 46. The method of claim 45, wherein the glucuronide is conjugated to the matrix through an amide linkage at C6.
 47. The method of claim 45, wherein the glucuronide is phenylthio-β-glucuronide.
 48. The method of claim 45, wherein the glucuronide is saccharolactone.
 49. The method of claim 48, wherein the matrix is agarose.
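
The claims above recite both amino acid ranges of SEQ ID NO: 2 and nucleotide ranges of SEQ ID NO: 1. The following Python sketch shows the codon arithmetic that relates the two kinds of coordinates. It assumes, for illustration only, that the coding sequence begins at nucleotide 1 of SEQ ID NO: 1, that both numberings are 1-based and inclusive, and that each amino acid corresponds to one three-nucleotide codon; the function name `aa_range_to_nt_range` is hypothetical. The nucleotide ranges recited in the claims are not altered by this sketch and may use different boundary conventions.

```python
def aa_range_to_nt_range(first_aa: int, last_aa: int, cds_start: int = 1) -> tuple[int, int]:
    """Map an inclusive, 1-based amino acid range to the inclusive,
    1-based nucleotide range of its codons.

    cds_start is the nucleotide position of the first base of codon 1
    (assumed to be 1 here; the source does not state it in this section).
    """
    if first_aa < 1 or last_aa < first_aa:
        raise ValueError("expected 1 <= first_aa <= last_aa")
    nt_first = cds_start + 3 * (first_aa - 1)   # first base of the first codon
    nt_last = cds_start + 3 * last_aa - 1       # last base of the last codon
    return nt_first, nt_last


# Amino acid ranges recited in the claims.
operator_binding = aa_range_to_nt_range(1, 63)
glucuronide_binding_short = aa_range_to_nt_range(64, 154)
glucuronide_binding_long = aa_range_to_nt_range(64, 195)

print(operator_binding)           # (1, 189)
print(glucuronide_binding_short)  # (190, 462)
print(glucuronide_binding_long)   # (190, 585)
```

Under these assumptions the end positions 462 and 585 coincide with those recited in claim 9, while the start positions and the end of the operator-binding region differ from the recited values by a few nucleotides.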